Saturday, April 28, 2018

Link between client driver, device, controller driver & core driver

This blog explains the interconnection between the client driver, the controller driver & the core driver.

Example: SPI-NOR framework (core), m25p80 client driver, Xilinx SPI controller driver

1) Client driver structure representation:-
struct m25p {
 struct spi_device *spi;
 struct spi_nor  spi_nor;
 u8   command[MAX_CMD_SIZE];
};
 
 
 
static int m25p_probe(struct spi_device *spi) {
 
struct flash_platform_data *data;
 struct m25p *flash;
 struct spi_nor *nor;
 
data = dev_get_platdata(&spi->dev); 

          flash_platform_data = spi_device->dev->platform_data;
 
/* Allocate memory for the m25p80 driver state */
 
flash = devm_kzalloc(&spi->dev, sizeof(*flash), GFP_KERNEL); 
 
/* SPI layer device & NOR FW layer device will be same  */ 
nor->dev = &spi->dev; 


spi_nor_set_flash_node(nor, spi->dev.of_node);

              nor->mtd->dev.of_node = device_node 

    /* MTD core reads the DT "label" property (if present) & assigns it to mtd->name */
              of_property_read_string(device_node, "label", &mtd->name);

/* This way the spi-nor core can access the client driver data */
nor->priv = flash;
 
/* Device core layer driver data points to client driver structure */ 
spi_set_drvdata(spi, flash);
       spi_device->dev->driver_data = m25p80;
 
/* Connect the client driver with the SPI layer:
   assign the spi_device to the client driver's spi_device pointer */
flash->spi = spi;
          m25p80->spi_device = spi_device (got as a param from **********)

}
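
Putting the fragments above together, here is a minimal hedged sketch of the probe-time linkage (condensed; error paths and the later spi_nor_scan()/MTD registration steps are omitted):

/* Hedged sketch: the pointer linkage set up in m25p_probe() (simplified). */
static int m25p_probe(struct spi_device *spi)
{
	struct m25p *flash;
	struct spi_nor *nor;

	/* One allocation holds both the client state & the spi-nor core object */
	flash = devm_kzalloc(&spi->dev, sizeof(*flash), GFP_KERNEL);
	if (!flash)
		return -ENOMEM;

	nor = &flash->spi_nor;

	/* SPI layer device & NOR FW layer device are the same struct device */
	nor->dev = &spi->dev;
	spi_nor_set_flash_node(nor, spi->dev.of_node);

	/* Core -> client back-pointer */
	nor->priv = flash;

	/* Device core -> client driver data (retrieved later in remove/suspend) */
	spi_set_drvdata(spi, flash);

	/* Client -> SPI layer */
	flash->spi = spi;

	return 0;
}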

  
 
2) SPI core layer: how platform_data reaches the spi_device:-

struct spi_device *spi_new_device(struct spi_controller *ctlr,
      struct spi_board_info *chip)
{
	struct spi_device *proxy;

	proxy = spi_alloc_device(ctlr);
	proxy->dev.platform_data = (void *)chip->platform_data;
	...
}
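
For reference, a hedged sketch of where that platform_data typically originates: board code registers a spi_board_info, and the SPI core copies its platform_data into the new device. The pdata field values here are illustrative, not from the original post:

/* Illustrative board-level registration; field values are assumptions. */
static struct flash_platform_data m25p80_pdata = {
	.name = "spi-flash",
	.type = "m25p80",
};

static struct spi_board_info board_spi_devices[] __initdata = {
	{
		.modalias	= "m25p80",	/* matches the client driver */
		.platform_data	= &m25p80_pdata,
		.max_speed_hz	= 20000000,
		.bus_num	= 0,
		.chip_select	= 0,
	},
};

/* From board init code: */
spi_register_board_info(board_spi_devices, ARRAY_SIZE(board_spi_devices));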



3) Controller driver (Xilinx ZynqMP QSPI):-

static int zynqmp_prepare_transfer_hardware(struct spi_master *master)
{
	/* Controller private data is retrieved back from the spi_master */
	struct zynqmp_qspi *xqspi = spi_master_get_devdata(master);
	...
}
 
static int zynqmp_qspi_probe(struct platform_device *pdev)
{
	struct spi_master *master;
	struct zynqmp_qspi *xqspi;

	/* Allocate the master together with the controller private data */
	master = spi_alloc_master(&pdev->dev, sizeof(*xqspi));

	xqspi = spi_master_get_devdata(master);
	          xqspi = ctlr->dev->driver_data;

	master->dev.of_node = pdev->dev.of_node;

	/* Platform core layer driver data points to the spi_master */
	platform_set_drvdata(pdev, master);
	          pdev->dev->driver_data = master;

	xqspi->dev = &pdev->dev;
	...
}
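
Finally, a hedged sketch of how the stored pointers are retrieved again at teardown time (assumes a kernel where struct mtd_info is embedded in spi_nor):

/* Hedged sketch: undoing the linkage at remove time. */
static int m25p_remove(struct spi_device *spi)
{
	/* Inverse of spi_set_drvdata() in probe */
	struct m25p *flash = spi_get_drvdata(spi);

	/* The spi-nor core would reach the same object via nor->priv */
	return mtd_device_unregister(&flash->spi_nor.mtd);
}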

Tuesday, February 14, 2017

Android AshMem

Android Shared Memory (ASHMEM)


Used to share memory between processes.

Ashmem uses virtual memory, whereas PMEM uses physically contiguous memory. PMEM is used to manage large (1-16+ MB) physically contiguous buffers.

1) Ashmem introduced the concept of pinning and unpinning.
       [A] Pinned pages of shared memory cannot be reclaimed under memory pressure.
       [B] Unpinned pages can be reclaimed.
       [C] Pinning/unpinning works by manipulating ashmem ranges.

The ashmem_range structure describes a range of unpinned pages within a shared memory region; all unpinned ranges are also linked on a global LRU list for reclaim.
struct ashmem_range {
        struct list_head lru;
        struct list_head unpinned;
        struct ashmem_area *asma;
        size_t pgstart;
        size_t pgend;
        unsigned int purged;
};
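
For context, each shared region is described by an ashmem_area, and the unpinned ranges above hang off its unpinned_list. The fields below are abbreviated from the ashmem driver and may differ by kernel version:

/* Abbreviated from the ashmem driver; exact layout varies by kernel version. */
struct ashmem_area {
        char name[ASHMEM_FULL_NAME_LEN]; /* shows up as /dev/ashmem/<name> */
        struct list_head unpinned_list;  /* list of struct ashmem_range    */
        struct file *file;               /* backing shmem/tmpfs file       */
        size_t size;                     /* region size in bytes           */
        unsigned long prot_mask;         /* allowed mmap() prot flags      */
};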


Ashmem Usage:-

ASHMEM is allocated and used as follows:

fd = ashmem_create_region("my_shm_region", size);
if(fd < 0)
  return -1;
// Use fd to mmap the region from offset 0 up to the size set at creation,
data = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if(data == MAP_FAILED)
  goto out;

It is this fd that gets shared with other processes (typically over Binder), not the name "my_shm_region".

PMEM Usage:-

char *pmaddr = pmem_map(fd, size);

PMEM deals with physical memory directly, while reference counting lives in the underlying "struct file", so PMEM would have to track the physical address and the fd together. In practice it cannot maintain a reliable reference count.

The process which creates shared memory through PMEM should therefore hold the file descriptor open until all the references are closed.

Ashmem, by contrast, maintains a ref-counted object for each fd (shared region); the count represents how many processes are currently accessing the region.
When the reference count drops to zero, no process is accessing that shared memory region.



3. Some of the functions available in Android shared memory:
(a) ashmem_create_region("myfile", size): creates a shared memory file at /dev/ashmem/myfile.

(b) ashmem_set_prot_region(fd, prot): sets the protection mask on the shared memory region.

(c) ashmem_pin_region(fd, offset, length): pins pages so the kernel will not reclaim them under low memory.

(d) ashmem_unpin_region(fd, offset, length): tells the ashmem subsystem that these pages may be reclaimed in low-memory scenarios. A process can later re-pin pages it had unpinned; see the sketch below.
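
The pin/unpin calls above boil down to ioctls on the region fd. A hedged user-space sketch, assuming the <linux/ashmem.h> UAPI header (error handling trimmed):

/* Hedged sketch of pin/unpin via the ashmem ioctl interface. */
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/ashmem.h>

static int create_region(const char *name, size_t size)
{
	int fd = open("/dev/ashmem", O_RDWR);
	if (fd < 0)
		return -1;
	ioctl(fd, ASHMEM_SET_NAME, name);   /* appears as /dev/ashmem/<name> */
	ioctl(fd, ASHMEM_SET_SIZE, size);   /* must be set before mmap()     */
	return fd;
}

int main(void)
{
	int fd = create_region("my_shm_region", 4096);
	char *data = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	/* Tell the kernel these pages may be reclaimed under memory pressure */
	struct ashmem_pin pin = { .offset = 0, .len = 4096 };
	ioctl(fd, ASHMEM_UNPIN, &pin);

	/* Re-pin before use; ASHMEM_PIN reports whether content was purged */
	if (ioctl(fd, ASHMEM_PIN, &pin) == ASHMEM_WAS_PURGED) {
		/* content was reclaimed and must be regenerated */
	}
	return 0;
}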

Thursday, August 25, 2016

Android App to HAL Code Flow

===================== GPS :: Flow from FW to HAL ===========================
App ==> FW java ==> JNI ==> HAL ==> kernel

Framework Java:-

/frameworks/base/services/core/java/com/android/server/location/GpsLocationProvider.java

Code:-
public class GpsLocationProvider implements LocationProviderInterface {
private native void native_inject_time(long time, long timeReference, int uncertainty);
}

*********************************************************************************
JNI Layer:-
/frameworks/base/services/core/jni/com_android_server_location_GpsLocationProvider.cpp
Code:-
/* Native method registration Start */
static JNINativeMethod sMethods[] = {
{"native_inject_time", "(JJI)V", (void*)android_location_GpsLocationProvider_inject_time},
......
}

Once the above methods are registered, Java can call them:
int register_android_server_location_GpsLocationProvider(JNIEnv* env)
{
return jniRegisterNativeMethods(
env,
"com/android/server/location/GpsLocationProvider",
sMethods,
NELEM(sMethods));
}
/* Native method registration End */

/* Get the HAL module here start */
    err = hw_get_module(GPS_HARDWARE_MODULE_ID, (hw_module_t const**)&module);

/* Search for the HAL module with the below macro name */
/hardware/libhardware/include/hardware/gps.h
#define GPS_HARDWARE_MODULE_ID "gps"

/* Call HAL module device open */
err = module->methods->open(module, GPS_HARDWARE_MODULE_ID, &device);

/* The below open method will get called */
loc_api/libloc_api_50001/gps.c
static struct hw_module_methods_t gps_module_methods = {
.open = open_gps
};
gps_device_t* gps_device = (gps_device_t *)device;
    sGpsInterface = gps_device->get_gps_interface(gps_device);

*********************************************************************************
HAL Layer:-

[1]
/hardware/qcom/gps/msm8084/loc_api/libloc_api_50001/loc.cpp
extern "C" const GpsInterface* get_gps_interface() {

return &sLocEngInterface; // See the definition below
}

// Defines the GpsInterface in gps.h
static const GpsInterface sLocEngInterface =
{
   sizeof(GpsInterface),
   loc_init,
   loc_start,
   loc_stop,
   loc_cleanup,
   loc_inject_time,
   loc_inject_location,
   loc_delete_aiding_data,
   loc_set_position_mode,
   loc_get_extension
};
The above structure is defined in hardware/libhardware/include/hardware/gps.h

Now when you call sGpsInterface->init(&sGpsCallbacks) from JNI, the loc_init() function above will get called.
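
To close the loop, here is a hedged sketch of what the registered native method looks like (paraphrased; the real implementation lives in com_android_server_location_GpsLocationProvider.cpp):

/* Sketch of the JNI bridge registered above; signature matches "(JJI)V". */
static void android_location_GpsLocationProvider_inject_time(JNIEnv *env,
		jobject obj, jlong time, jlong timeReference, jint uncertainty)
{
	/* sGpsInterface was fetched earlier via gps_device->get_gps_interface() */
	if (sGpsInterface)
		sGpsInterface->inject_time(time, timeReference, uncertainty);
}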

========================================================================

Friday, August 5, 2016

i.MX SoC initialization & how platform data is populated into platform devices from DT

Pre-condition:- All data are passed from DT.


Initial Flow:-
(kernel_init) from [<8000e4d8>] (ret_from_fork+0x14/0x20)
(kernel_init_freeable) from [<803ffb1c>] (kernel_init+0x18/0xf4)
(do_one_initcall) from [<8056cc04>] (kernel_init_freeable+0x104/0x1d0)
(customize_machine) from [<80008904>] (do_one_initcall+0xa4/0x118)
(imx6q_init_machine) from [<8056d938>] (customize_machine+0x24/0x48)
(of_platform_populate) from [<80577604>] (imx6q_init_machine+0x64/0x27c)
(of_platform_bus_create) from [<80361c2c>] (of_platform_populate+0x70/0xa4)
(of_platform_device_create_pdata) from [<80361b3c>] (of_platform_bus_create+0x100/0x180)
(of_device_alloc) from [<803619e0>] (of_platform_device_create_pdata+0x40/0x9c)
(of_device_make_bus_id) from [<8036198c>] (of_device_alloc+0x130/0x144)

We will see what is done in imx6 machine init (static void __init imx6q_init_machine(void)) step by step,

[1]
Check the CPU type, revision & print it,
          Code:-
                    if (cpu_is_imx6q() && imx_get_soc_revision() == IMX_CHIP_REVISION_2_0)


[2]
Then comes soc device init (parent = imx_soc_device_init();) Here parent device is soc.

    [A] Allocate memory for the soc device attribute,
           Code:-
                     soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);

    [B] Read the model property from DT & store it in SOC machine type.
          Code:-
                     ret = of_property_read_string(root, "model", &soc_dev_attr->machine);
          DT Code:-
                           model = "Freescale i.MX6 Solo/Dual Lite Rochester Orinoco Board";
                           compatible = "fsl,imx6dl-sabresd", "fsl,imx6dl";
   
    [C] Assign the SOC ID, revision in device attribute. 

    [D] Register the device with the system. A kobject (sysfs entry) will be created. As part of device registration, the below device variables are initialized.
          Code:-
                    soc_dev->attr = soc_dev_attr;
                    soc_dev->dev.bus = &soc_bus_type;
                    soc_dev->dev.groups = soc_attr_groups;
                    soc_dev->dev.release = soc_release;
                    dev_set_name(&soc_dev->dev, "soc%d", soc_dev->soc_dev_num);
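
A hedged sketch of this [A]-[D] sequence using the soc bus API (cf. drivers/base/soc.c); the soc_id/revision strings below are illustrative, not from the post:

/* Sketch of imx_soc_device_init()-style registration (simplified). */
static struct device * __init my_soc_device_init(void)
{
	struct soc_device_attribute *soc_dev_attr;
	struct soc_device *soc_dev;
	struct device_node *root;

	soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
	if (!soc_dev_attr)
		return NULL;

	soc_dev_attr->family = "Freescale i.MX";

	/* [B] read the "model" property from the DT root node */
	root = of_find_node_by_path("/");
	of_property_read_string(root, "model", &soc_dev_attr->machine);
	of_node_put(root);

	/* [C] illustrative SoC ID & revision */
	soc_dev_attr->soc_id = "i.MX6DL";
	soc_dev_attr->revision = "1.1";

	/* [D] creates /sys/devices/soc0 with family/machine/soc_id/revision */
	soc_dev = soc_device_register(soc_dev_attr);
	if (IS_ERR(soc_dev)) {
		kfree(soc_dev_attr);
		return NULL;
	}

	return soc_device_to_device(soc_dev);
}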


[3] Then read all the device nodes from DT & populate them as platform devices.
       Code:-
                 of_platform_populate(NULL, of_default_bus_match_table, NULL, parent);


of_platform_populate() - Populate platform_devices from device tree data
 * @root: parent of the first level to probe or NULL for the root of the tree
 * @matches: match table, NULL to use the default
 * @lookup: auxdata table for matching id and platform_data with device nodes
 * @parent: parent to hook devices from, NULL for toplevel
 *
 * Similar to of_platform_bus_probe(), this function walks the device tree
 * and creates devices from nodes.  It differs in that it follows the modern
 * convention of requiring all device nodes to have a 'compatible' property,
 * and it is suitable for creating devices which are children of the root
 * node (of_platform_bus_probe will only create children of the root which
 * are selected by the @matches argument).


During a particular driver's probe, this data is handed over through the device passed as a parameter.

For example touchscreen probe:-
Code:-
static int mxt_probe(struct i2c_client *client, const struct i2c_device_id *id)

The client data will be populated from the DT node below: the I2C core creates the i2c_client for each child of the bus node when the adapter is registered.

DT Code:-
       atmel_mxt_ts@4b {
                compatible = "atmel,atmel_mxt_ts";
                reg = <0x4b>;
                gpio_mxt449t_rst = <&gpio1 23 0>;
                interrupt-parent = <&gpio1>;
                interrupts = <24 0>;
                gpio_intr = <&gpio1 24 0>;
                status = "okay";
        };
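
For illustration, a hedged sketch of how a probe function could pull these properties off the node. The mainline atmel_mxt_ts driver parses different bindings; the property names below are simply taken from the DT snippet above:

/* Hedged sketch: DT parsing in probe(); property names from the node above. */
#include <linux/of_gpio.h>

static int mxt_probe(struct i2c_client *client,
		     const struct i2c_device_id *id)
{
	struct device_node *np = client->dev.of_node;
	int reset_gpio, irq_gpio;

	/* reg = <0x4b> was consumed by the I2C core: client->addr == 0x4b */
	reset_gpio = of_get_named_gpio(np, "gpio_mxt449t_rst", 0);
	irq_gpio   = of_get_named_gpio(np, "gpio_intr", 0);
	if (!gpio_is_valid(reset_gpio) || !gpio_is_valid(irq_gpio))
		return -EINVAL;

	/* interrupts/interrupt-parent were already mapped: client->irq is set */
	return 0;
}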




Thursday, February 18, 2016

ARM Cache Basic Architecture

The problem with direct mapping: if addresses 0x00, 0x40 & 0x80 map to the same cache line and are used repeatedly, they keep evicting one another, so caching them is wasted effort. Set-associative caches were introduced to solve this.



Here address 0x00 can live in either "cache way 0" or "cache way 1", but not both. The index is used to select a particular set (refer to the 1st figure).

Real Example:-
Cache Size = 32KB (32768 bytes)
4-way set associative
So,

Size of each way = 32KB/4 = 8KB
No. of bytes in each line = 32

No. of lines per way = 8KB/32 = 256 lines. So we need 8 bits of index to select a line (set).
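
A small sketch of how a 32-bit address splits up for this geometry (32-byte lines give 5 offset bits, 256 sets give 8 index bits, leaving 19 tag bits); the example address is arbitrary:

/* Address breakdown for a 32KB, 4-way, 32-byte-line cache:
 * offset = bits [4:0], index = bits [12:5], tag = bits [31:13]. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t addr   = 0x80081234;
	uint32_t offset = addr & 0x1F;          /* 5 bits: byte within the line */
	uint32_t index  = (addr >> 5) & 0xFF;   /* 8 bits: one of 256 sets      */
	uint32_t tag    = addr >> 13;           /* remaining 19 bits            */

	printf("tag=0x%x index=%u offset=%u\n", tag, index, offset);
	return 0;
}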



Cache controller:-

  • A HW block that has the task of managing the cache memory, in a way that is (largely) invisible to the program.
  • It takes read and write memory requests from the core and performs the necessary actions to the cache memory or the external memory.
  • When it receives a request from the core it must check to see whether the requested address is to be found in the cache. This is known as a cache look-up.
  • If the requested instruction/data is not in the cache, it is a cache miss, and the request is passed on to L2.
  • Once the data/instruction comes back from the next level, a cache linefill places it in the cache.





Virtual and physical tags and indexes:-


VIVT ==> Bad
The processor uses the VA to provide both the index and tag values.

Advantage:-
Core can do a cache look-up without the need for a VA to PA translation. 

Disadvantage:-
The drawback is that changing the virtual-to-physical mappings in the system means the cache must first be cleaned and invalidated, which leads to a significant performance impact.

VIPT ==> Good (ARM follows this)

Advantage:- 
With a physical tagging scheme, changes in virtual-to-physical mappings do not require the cache to be invalidated, because the tag still matches the same PA even though the VA may vary.

Using a virtual index, cache hardware can read the tag value from
the appropriate line in each way in parallel without actually performing the virtual to physical address translation, giving a fast cache response.

PIPT:-
The limitation of VIPT appears with larger caches: for a 4-way set associative 32KB or 64KB cache, bits [12] and [13] of the address are required to select the index.

If 4KB pages are used in the MMU, only bits [11:0] are guaranteed identical between VA and PA, so bits [13:12] of the virtual address might not be equal to bits [13:12] of the physical address, and the same physical line can alias into different sets.

The solution is PIPT, where both the index and the tag come from the physical address.



Cache policies:-

Allocation policy
Read allocate ==> Policy allocates a cache line only on a read.

Write allocate ==> Policy allocates a cache line for either a read or a write that misses in the cache.

Replacement policy
Round-robin
Pseudo-random
Least Recently Used (LRU)

Write policy

Write-through ==> Cache & main memory are coherent.
Write back ==> writes are performed only to the cache, and not to main memory.

Interesting Cache Issues during DMA Operation:-
https://lwn.net/Articles/2265/
https://lwn.net/Articles/2266/


Wednesday, January 13, 2016

Linux Mutex Internals

http://linuxkernelarticles.blogspot.in/2013/02/mutex-implementation-in-arm-architecture.html

Refer to the above blog for the mutex implementation on the ARM architecture. Here we will cover the rest.

Basic Condition of Mutex:-
1) Only one task can hold the mutex at a time.
      struct mutex {
         ....
                     atomic_t count; //1: unlocked, 0: locked, negative: locked, possible waiters 
         ....
  }
2) Only the owner can unlock the mutex.
      struct mutex {
         struct task_struct *owner;
   }
3) Multiple unlocks are not permitted,
4) Recursive locking is not permitted,
5) A mutex object must be initialized via the API(mutex_init),
6) A mutex object must not be initialized via memset or copying,
7) Task may not exit with mutex held,
8) Memory areas where held locks reside must not be freed,
9) Held mutexes must not be reinitialized,
10) Mutexes may not be used in hardware or software interrupt contexts such as tasklets and timers.
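
A minimal sketch of usage that respects the rules above (hedged; DEFINE_MUTEX is the static-initialization counterpart of mutex_init()):

/* Minimal sketch of correct mutex usage per the rules above. */
#include <linux/mutex.h>

static DEFINE_MUTEX(my_lock);	/* static initialization (cf. rule 5) */
static int shared_counter;

static void update_counter(void)
{
	mutex_lock(&my_lock);	/* may sleep: never use in interrupt context */
	shared_counter++;	/* only one task holds the mutex here */
	mutex_unlock(&my_lock);	/* must be done by the owner (rule 2) */
}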

How are other processes waiting for this mutex woken up?
1) All processes waiting for this mutex are linked in via struct mutex_waiter.list.
2) Get the 1st entry from the wait list & call wake_up_process(waiter->task);  {

           // Wakeup specific process
          p->state = TASK_WAKING;
          ttwu_queue(p, cpu); {

                    ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags) {
                               ttwu_activate(rq, p, ENQUEUE_WAKEUP | ENQUEUE_WAKING); {
                                     activate_task(rq, p, en_flags); // Enqueue the task in run queue
                                     p->on_rq = 1;
                               }
                   }
                   ttwu_do_wakeup(rq, p, wake_flags); {
                              p->state = TASK_RUNNING;
                   }

         }
}

Common Questions:-
1) Why is a spinlock variable (spinlock_t wait_lock;) required in the mutex structure?
     To avoid concurrent access to the mutex structure itself.

2) Can a process be preempted or sleep while holding the mutex?
     Yes. A process holding a mutex can sleep.

3) A mutex is held by process-1 executing on CPU-0, and process-2 executing on CPU-1 wants the same mutex. How will mutex unlock wake up process-2?
When the kernel tries to wake up the waiting process-2, it first selects the CPU on which the process should run:

try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags) {
                    cpu = select_task_rq(p, p->wake_cpu, SD_BALANCE_WAKE, wake_flags);
                    ttwu_queue(p, cpu);
}

4) What is the difference between a mutex, a semaphore & a spinlock?