Tuesday, 16 April 2013

part 5 of (useful post, must see) Everything about overclocking, kernels, governors, i/o schedulers, etc.


5. DUAL CORE CPU Q&A and TWEAKS

Q&A on parameters and factors that control the performance, throughput and battery-life delivered by GS2's dual-core CPU, and some CPU tweaks

Q. "What is the basic hardware of GS2, and indeed our own canvas 2 and canvas HD, that makes all of us enjoy this phone so much and boast about benchmark scores to office-mates and friends?"
A. 
Processor: ARM Cortex-A9 MPCore processor on Exynos 4210 SoC (System on a Chip - ICs where all components are integrated into a single chip), built on 45nm semiconductor technology. Exynos 4210 is supposed to give 6.4GB/s memory bandwidth for heavy-weight ops such as full HD video encoding.
GPU: ARM Mali-400
Memory: LPDDR2 (may be DDR3)

Q. "What is the significance of bus frequency?"
A. Bus speed, at its simplest, determines how fast data travels to and from memory. Memory throughput is directly proportional to bus frequency. In tasks that do a small amount of work on every element of a data set, a lower bus speed means the CPU has to wait longer for data to arrive from memory, because the CPU spends only a little time on each element and a slow bus cannot keep up.
Advanced Micro-controller Bus Architecture (AMBA) is used as the on-chip bus in system-on-a-chip designs, like our device.

Q. "What is modifying bus frequency? How do I do it? Advantages?"
A. Stock behavior is dynamic bus frequency scaling, wherein the operating bus speed is calculated dynamically for each CPU frequency depending on the application/process's requirement. We can modify this behavior by setting static bus frequency scaling, which specifies the bus speed at which each CPU frequency should operate. Three values/levels are possible.
0 – 400 mhz
1 – 266 mhz
2 – 133 mhz

Sample bus frequency modification:
echo "0 0 0 1 1 1 2 2" > /sys/devices/system/cpu/cpu0/cpufreq/busfreq_static
echo "enabled" > /sys/devices/system/cpu/cpu0/cpufreq/busfreq_static

This means for the first three (highest) CPU frequency steps, the 400 mhz bus will be used.
For the next three, 266 mhz
And for the last two, 133 mhz

Advantages of bus frequency modification: i) Saves battery by using low bus speeds on low frequencies and ii) Prevents overheating.
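
As a rough illustration of how the level string lines up with the CPU steps, the sketch below pairs each bus level with a frequency step. The step list (1600 down to 100 mhz) is borrowed from the Siyah example later in this post and is only an assumption, since your kernel may have a different number of steps. The name show_bus_map is made up for this example, and nothing here writes to sysfs.

```shell
# Hypothetical helper: print which bus speed each CPU step would get.
# $1 = CPU steps in mhz, highest first (assumed table)
# $2 = bus levels in the same order, as written to busfreq_static
show_bus_map() {
    cpu_steps=$1
    bus_levels=$2
    set -- $bus_levels            # bus levels become $1, $2, ...
    for step in $cpu_steps; do
        case "$1" in
            0) bus=400 ;;
            1) bus=266 ;;
            2) bus=133 ;;
            *) bus=unknown ;;     # missing or invalid level
        esac
        echo "CPU ${step} mhz -> bus ${bus} mhz"
        if [ $# -gt 0 ]; then shift; fi
    done
}

show_bus_map "1600 1400 1200 1000 800 500 200 100" "0 0 0 1 1 1 2 2"
```

Printing the mapping like this before writing the string makes it easier to spot a level that ended up on the wrong CPU step.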

Q. "I experience some lags sometimes while playing HD videos or playing heavy 3d games using static bus frequencies. Why?"
A. HD videos and some games require a minimum of 400/266 mhz bus irrespective of the CPU frequencies being used during the run. To resolve this, set a higher bus speed for 500 mhz and higher frequencies, or simply disable static bus frequency scaling to switch back to the default.
echo "disabled" > /sys/devices/system/cpu/cpu0/cpufreq/busfreq_static

Q. "Our phone CPU has two cores. How are they utilized? Are the two cores ON all the time?"
A. The stock behavior is Dynamic Hot Plug Mode, where the second core is turned on depending on the load. If the load can be handled by a single core, the second core is turned off dynamically. This behavior can be controlled using the Tegrak Second Core app from the market if your kernel supports it (Siyah, Lulz, etc. support this). Using this app you can set three modes:
Dynamic Hot Plug Mode: Default mode. The second core is kicked in depending on the load, and kicked out when the first core can handle the load alone.
Single Core Mode: Irrespective of the load, only the first core is used. This can lead to increased battery life, but reduced performance.
Dual Core Mode: Even at low loads, both cores are always active. Increased performance, but reduced battery life.

Recommendation: Use the stock hotplug mode during normal use. Switch to dual core mode only for benchmarking or playing some heavy 3d games.

Q. "OK, I'm using hot plug mode, still i want to control how often the second core kicks in. To make it more aggressive/more mild depending on my usage."
A. You can set UP & LOW thresholds for second core in Screen-On and Screen-Off states.
Examples:
echo "70" > /sys/module/pm_hotplug/parameters/loadh
echo "25" > /sys/module/pm_hotplug/parameters/loadl
echo "90" > /sys/module/pm_hotplug/parameters/loadh_scroff
echo "35" > /sys/module/pm_hotplug/parameters/loadl_scroff

As you can see, when load > 70% the second core becomes active, and when load drops below 25% the second core is turned off.
During screen off, these values are 90 & 35 respectively. This helps in reducing unwanted kick-ins of the second core during screen-off state when music is playing, a download is running, etc.
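
The gap between the two thresholds is what stops the second core from flapping on and off. Below is a minimal sketch of that decision, only to show the idea. It is a model written for this post, not the kernel's code; the name second_core_next is made up, and it assumes the core turns on when load >= loadh (as described in the Siyah section) and off when load < loadl.

```shell
# Hypothetical model of the hotplug decision; not the kernel's actual code.
# $1 = current state of second core (on/off), $2 = load in percent,
# $3 = loadh (up threshold), $4 = loadl (low threshold)
second_core_next() {
    if [ "$1" = "off" ] && [ "$2" -ge "$3" ]; then
        echo on       # load reached the up threshold: bring cpu1 online
    elif [ "$1" = "on" ] && [ "$2" -lt "$4" ]; then
        echo off      # load fell below the low threshold: take cpu1 offline
    else
        echo "$1"     # in between: keep the current state
    fi
}

# Screen-on thresholds from the example above (loadh=70, loadl=25)
state=off
for load in 40 75 50 20; do
    state=$(second_core_next "$state" "$load" 70 25)
    echo "load ${load}% -> second core ${state}"
done
```

Note how a load of 50% keeps the second core on once it is already running, while the same load would not have turned it on in the first place.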

Q. "Like governors, is there a sampling rate/interval also at which the load on CPU is checked for crossing thresholds to turn second core ON?"
A. Yes, there is. But it is set at kernel level in most kernels and cannot be controlled at user level. As you might guess, a longer sampling interval (checking the load less often) could cause the second core to kick in less often and thus save a little battery. In Siyah kernel, though, these sampling intervals are configurable.

Q. "Advantages/Disadvantages of switching to Single Core/Dual Core modes?"
A. Using only a single core can save some battery, but can have adverse effects if there are heavy tasks that need both cores too often: 3d games, full hd videos, etc. So use it wisely.
Using dual core mode can reduce latency by a tiny bit on high loads, as compared to hot plugging. But hot plugging is intelligent enough to turn the second core ON really fast when load demands it. Only the first core (cpu0) can enter deep-idle (LPA), so using dual core mode in an idle system causes unwanted excess power consumption.
Recommendation: Use Hot Plugging and tune thresholds (like mentioned above) for a better experience.

Q. "What are these modes: IDLE, LPA and AFTR?"
A. Between screen off and deep sleep states, there are some idle modes supported by cpuidle driver. They are IDLE aka Normal Idle, LPA aka Deep Idle and AFTR aka ARM Off Top Running. Race to idle by CPU is implemented for power management. 

In IDLE state, CPU is not clocked anymore, but no hardware is powered down.

In deep idle (LPA), a state after IDLE, the CPU is again not clocked, but some parts of the hardware are also powered down. Deep idle brings real power savings, so there is no need to put a hard limit on frequency during screen-off using a screen-off profile. (Good practice is to use a governor with a built-in screen-off profile rather than a user-configured screen-off profile that puts a hard limit on frequency.) Deep idle is not used when the device is entering deep sleep, nor when the device is woken from suspend/deep sleep. While entering/exiting deep idle, the CPU is set statically to SLEEP_FREQ and is not clocked below or above it until it exits this state.

AFTR is a patch to support Top=Off mode for deep idle. The Level 2 cache keeps its data during this mode.

We can have IDLE or AFTR modes with LPA enabled or disabled. (Obviously it is not possible to have IDLE and AFTR together)
Values:
0: IDLE
1: AFTR
2: IDLE+LPA
3: AFTR+LPA
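
Looking at the four values, they behave like a two-bit mask: the low bit picks AFTR over IDLE and the next bit adds LPA. That reading is inferred from the table above, not taken from the cpuidle driver source, so treat the sketch below as an assumption. The function name describe_idle_mask is made up for this example.

```shell
# Hypothetical decoder; the bit meanings are inferred from the value table,
# not taken from the cpuidle driver source.
describe_idle_mask() {
    mask=$1
    if [ $((mask & 1)) -ne 0 ]; then mode=AFTR; else mode=IDLE; fi
    if [ $((mask & 2)) -ne 0 ]; then mode="${mode}+LPA"; fi
    echo "$mode"
}

for m in 0 1 2 3; do
    echo "$m: $(describe_idle_mask "$m")"
done
```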

Q. "What idle modes are recommended for power saving? How do I change it?"
A. Recommended for power saving is to enable AFTR and LPA, i.e. value 3.
Example:
echo "3" > /sys/module/cpuidle/parameters/enable_mask

Q. "What is sched_mc?"
A. sched_mc (Schedule Multi Core), promoted on ARM by the Linaro team, makes process scheduling multi-core aware, i.e. it utilizes both cores wisely to save power and balance performance. Even though sched_mc is sort of an alternative to cpu hot plugging, we can use sched_mc with the default hot plug mode.

Possible Values:
0 : No power saving load balance, default in our exynos4210 Soc.
1 : Fill one thread/core/package first for long running threads. In our single-CPU dual-core device, multithreading does not come into picture, so load balancing is almost redundant to hotplugging.
2 : Also bias task wake-ups to semi-idle CPU package for power savings. (Bias new tasks to cpu1 if cpu0 is mostly filled with running tasks). This is 'overloading' CPU0 first.

Q. "What value is recommended for sched_mc?"
A. 1) If you find advantages to sched_mc, use sched_mc=1 for a possible battery saving. Since load balancing is redundant with hotplugging, it may not have any advantage on the exynos chip.
2) For performance use 2. But do remember that loading CPU0 and leaving CPU1 cannot do justice to hitting deep idle states sooner, since the second core cannot enter deep idle. So extra performance or no performance, value 2 will drain some more battery because deep idle is delayed.
3) To do justice to hotplugging, use value 0.
Example:
echo "0" > /sys/devices/system/cpu/sched_mc_power_savings

Q. "What is MALI aggressive policy on GPU?"
A. Mali aggressive scaling policy simply lowers the up-threshold of the GPU so that the GPU jumps to the second frequency step sooner. This makes more sense if the lower step is under-clocked. In one release of Siyah, the threshold was changed from the default 65 to 55.

Q. "What is tree rcu, fast nohz, jrcu?"
A. Read-Copy Update (RCU) is a synchronization mechanism added to Linux kernel. RCU improves scalability by allowing readers to execute concurrently with writers. 

Tree RCU is a new implementation of the original classic RCU that achieves more scalability as the number of CPUs increases. Tree RCU fixes a performance bug in classic RCU that results in massive lock contention on the internal RCU lock on systems with a large number of CPUs.

Fast NoHz is an optimized version of the traditional Tree RCU. Many new kernels are using the Tickless NoHz design. This RCU is tailored and designed to work with the new NoHz kernel system.

The JRCU mechanism, in its simplest form, runs batch operations from a single CPU, relieving other CPUs of this periodic responsibility. This is important for real-time applications that require full use of dedicated CPUs. On our dual core single CPU, JRCU can conflict with hot-plugging, hence we will have tree rcu (with or without CONFIG_RCU_FAST_NO_HZ) in our kernels.

Q. "What are SLAB, SLUB, SLQB?"
A. They're three memory allocation mechanisms.

Slab allocation is a memory management mechanism intended for the efficient memory allocation of kernel objects which displays the desirable property of eliminating fragmentation caused by allocations and de-allocations. SLAB is used to retain allocated memory that contains a data object of a certain type for reuse upon subsequent allocations of objects of the same type.

SLUB allocator promises better performance and scalability by dropping most of the queues and related overhead and simplifying the slab structure in general, while retaining the current slab allocator interface. SLUB offers to make alignment of objects and cleaning up of caches easier, as compared to SLAB.

SLQB - SLAB allocator with Queue. This is a slab allocator that focuses on per-CPU scaling. It is designed to be simple and targets systems with a small number of CPUs.

Note that SLUB is significant on a system with large number of CPUs. SLAB has the advantage of being simple.

Q. "Can i change the RCU synchronization mechanism & memory allocators?"
A. NO. They are set at compile time at kernel level, and are not configurable from user space.



MISC Q&A

Q. "What is top-off current?"
A. The charge cycle for the device's battery actually consists of two stages.
The first stage consists of supplying a constant current until the battery reaches its constant/peak voltage, something between 4.1 and 4.2 V.
Upon reaching this peak voltage, a constant voltage is applied until the charge current drops below the top-off current. This is the second stage. Stock top-off current is 200 mA. From Siyah 2.6.9 it is set to 100 mA so that a little more juice goes into the battery, since a lower top-off current means the constant voltage is applied for longer in the second phase of charging.
If you love your battery, do not charge to 'real' 100% too often. Perform the 'trickle' charge only once every 20 days or so.

Q. "My battery drains fast sometimes immediately after a kernel flash. It's like this: i reboot the device with 40 percent battery left and when it returns, i have only 20 percent left. Anything i can do?"
A. Your battery is not actually draining fast; the fuel gauge is showing funny values that are not the real percentage left. High loads, such as immediately after a reboot, cause the fuel gauge to report low percentages. What you can do is reset the fuel gauge.
[Courtesy Entropy512. The code is for i9100. Location of reset-file may be different in other variants of GS2]
Give it a few hours after you reset the gauge. It may still show you funny values during that period, then the battery percentage should be fine.
Code:
echo "1" > /sys/devices/platform/i2c-gpio.9/i2c-9/9-0036/power_supply/fuelgauge/fg_reset_soc

Q. "So CPU/GPU or GPS chip, which is the biggest power drainer in GS2?"
A. It is the bright AMOLED display! The display uses roughly 370mW on average and 960mW at 100% brightness with a full white screen. Avoid bright wallpapers, reduce brightness.

Q. "What're the approximate power consumptions by the device peripherals & activities?"
A.
  • AMOLED Display: Average - 370mW. Full white background, 1% brightness - 450mW. Full white background, 100% brightness - 960mW. So roughly every percentage of brightness increased accounts for an additional 5.2mW. (Now we know why using dark wallpapers and reducing brightness matters so much more than undervolting).
  • Illuminated button - 40mW
  • Led lamp next to camera - 1.3W
  • Camera - 700mW
  • Bluetooth and GPS - 110 to 180mW (Really?!)
  • 2G to 3G switching - 800mW for 8 seconds. (This is no h/w component, but we should know)
  • CPU 1.4 Ghz full load, 100% brightness - 4W+
  • CPU 1.4 Ghz average - 3.2W
  • CPU 1.6 Ghz full load - 5.9W (Forget Ocing to 1600mhz)
  • BLN - 200mW during suspend state opposed to deep sleep 8mW without BLN.
  • Wifi download - 1.51W
  • 2G download - 1.598W
  • 2G upload - 853mW
  • 3G download - 1.603W
  • 3G upload - 2.136W (Stay away from uploading your videos to youtube via 3G)
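
The per-percent brightness figure in the display entry can be checked with a bit of arithmetic: (960 - 450) mW spread over the 99 steps between 1% and 100%. The sketch below does that and also estimates display power at any brightness, assuming power rises linearly between the two measurements (an assumption made here, the list only gives the two end points). The helper name display_power_mw is made up.

```shell
# Figures from the list above, in mW (full white background)
P_MIN=450     # at 1% brightness
P_MAX=960     # at 100% brightness

# Hypothetical helper: estimate display power in mW at brightness $1 (1..100),
# assuming power rises linearly between the two measured points.
display_power_mw() {
    echo $(( P_MIN + ($1 - 1) * (P_MAX - P_MIN) / 99 ))
}

# Extra power per brightness percent, in tenths of a mW (integer shell math)
echo "per percent: $(( (P_MAX - P_MIN) * 10 / 99 )) tenths of a mW"
echo "at 50%: $(display_power_mw 50) mW"
```

Integer division truncates, so the per-percent result comes out a shade under the rounded 5.2mW quoted above.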

Q. "Sometimes the device says 'low battery' and switches itself off. But when i turn it on, there's 30% left. Why?"
A. Some heavy load conditions, such as quickly reaching 1600 mhz on full load, cause the battery voltage to dip below 3.3V, and this is wrongly interpreted as the battery being empty.

Q. "What is 500 mhz core voltage bug?"
A. It's not a bug. It's a safety feature. What is it: When frequency is raised to 500 from a frequency below it, the core voltage used for 500 mhz is the core voltage of 800 mhz. When frequency is dropped to 500 from a frequency above it, the core voltage used is its own voltage. So a climb to 500 uses 800's voltage and a fall to 500 uses its own voltage. If you're UVing, do it properly for both 500 and 800. Now you know why.





SIYAH SPECIFIC TWEAKS (2.6 gingerbread versions)

Summary of all user configurable parameters in Siyah kernel. Some were already listed in the above posts, and some I may have missed out. Let's have everything in one place, with examples.
1) CPU Frequency & Voltages
#Set frequency steps according to the number of steps in your kernel.
echo "1600 1400 1200 1000 800 500 200 100" > /sys/devices/system/cpu/cpu0/cpufreq/freq_table
#Set voltages for frequency steps. Changes possible at +/-25mV steps
echo "1425 1325 1275 1175 1075 975 950 950" > /sys/devices/system/cpu/cpu0/cpufreq/UV_mV_table
#Sets global scaling min&max frequencies
echo "200000" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
echo "1400000" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
2) Scaling Governor & Smooth Scaling Parameters
#Set scaling governor, according to available governors in your kernel
echo ondemandx > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
#Smooth scaling parameters to control any governor jumping to higher frequency directly (other governor specific tweaks in first post).
echo "2" > /sys/devices/system/cpu/cpu0/cpufreq/smooth_target
echo "2" > /sys/devices/system/cpu/cpu0/cpufreq/smooth_offset
echo "2" > /sys/devices/system/cpu/cpu0/cpufreq/smooth_step


note: Smooth scaling is disabled for interactive based governors (Interactive, Interactivex and Lulzactive) in Siyah. Idle loop based governors shouldn't like throttling.
When the CPU is on a certain frequency (let's call this current_freq) and the governor decides to jump the CPU up to a higher frequency (let's call this target_level), then:
If target_level is less than smooth_target (as a level number, i.e. a higher frequency), the CPU first jumps either to smooth_target+smooth_offset or to current_freq-smooth_step, whichever frequency is lower.
Note that L0=1600 mhz, L1=1400 mhz, L2=1200 mhz, L3=1000 mhz, ..., L7=100 mhz

Example:
CPU current_freq is 500 (L5) and Ondemand governor decides to jump to 1400 (L1).
We have smooth_target = 2 = L2, smooth_offset = 2 and smooth_step = 2
smooth_target + smooth_offset = L2+2 = L4 = 800 mhz
current_freq - smooth_step = L5-2 = L3 = 1000 mhz
Since 800 mhz is the lower of the two, the CPU jumps to 800 mhz first and then to 1400 mhz.
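
The same calculation can be sketched in shell, which is handy for trying other parameter values. This only models the rule as described in this post and is not the kernel's code; it does no bounds checking and ignores edge cases. The names level_to_mhz and smooth_first_level are made up, and the 8-step level table is the one assumed in this post.

```shell
# Hypothetical helpers modelling the rule as described in this post;
# this is not the kernel's code.

# Convert a level index (0..7) to mhz, using the 8-step table from this post.
level_to_mhz() {
    lvl=$1
    set -- 1600 1400 1200 1000 800 500 200 100
    shift "$lvl"
    echo "$1"
}

# $1 = current level, $2 = target level, $3 = smooth_target,
# $4 = smooth_offset, $5 = smooth_step.
# Prints the level the CPU goes to first.
smooth_first_level() {
    cur=$1
    target=$2
    if [ "$target" -lt "$3" ]; then
        a=$(( $3 + $4 ))          # smooth_target + smooth_offset
        b=$(( cur - $5 ))         # current level minus smooth_step
        # A higher level index means a lower frequency, so take the larger index
        if [ "$a" -gt "$b" ]; then echo "$a"; else echo "$b"; fi
    else
        echo "$target"            # target is not above smooth_target: jump directly
    fi
}

first=$(smooth_first_level 5 1 2 2 2)
echo "first jump: L${first} = $(level_to_mhz "$first") mhz"
```

With the values from the example (current L5, target L1, all three parameters set to 2) this prints L4 = 800 mhz, matching the hand calculation.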
3) GPU Clock, Voltages, Thresholds & Staycounts
#Set GPU clocks ( valid values are 400/(x*0.5) where x is an integer >= 2. So valid values will be 400/1,400/1.5,etc. Examples: 40 80 89 100 114 133 160 200 267 400 )
echo "160 200 267" > /sys/class/misc/gpu_clock_control/gpu_control
#Set GPU voltages (changes possible at +/-50mV ie at 50000 steps)
echo "900000 950000 1000000" > /sys/class/misc/gpu_voltage_control/gpu_control
#Set GPU Up and Down thresholds
echo "85% 55% 85% 50%" > /sys/class/misc/gpu_clock_control/gpu_control
Working of Thresholds:
Up threshold for Step 1 (160 mhz) = 85% [GPU scales up to 200 from 160 when load >= 85%]
Down Threshold for Step 2 (200 mhz) = 55% [GPU scales down to 160 from 200 when load < 55%]
Up Threshold for Step 2 (200 mhz) = 85% [GPU scales up to 267 from 200 when load >= 85%]
Down Threshold for Step 3 (267 mhz) = 50% [GPU scales down to 200 from 267 when load < 50%]
Step 1 will not have a Down Threshold & Step 3 will not have an Up Threshold since they don't have a step to scale-down to or scale-up to.
#Set GPU Staycounts. Staycount acts as a rate multiplier for GPU sampling intervals. Now you have complete control over GPU!
echo "1 1 1" > /sys/class/misc/gpu_control/gpu_staycount
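
Two parts of the GPU section above lend themselves to a quick sketch: the list of valid clocks from the 400/(x*0.5) rule, and how the four thresholds decide the next step. Both functions below are illustrations written for this post with made-up names (gpu_valid_clocks, gpu_next_step). The threshold function models the behaviour as described here and is not the Mali driver's code.

```shell
# Print valid GPU clocks 400/(x*0.5) for x = 2..$1, rounded to the nearest mhz.
# 400/(x*0.5) equals 800/x; (1600 + x) / (2*x) rounds that using integer math.
gpu_valid_clocks() {
    x=2
    out=""
    while [ "$x" -le "$1" ]; do
        out="${out}${out:+ }$(( (1600 + x) / (2 * x) ))"
        x=$((x + 1))
    done
    echo "$out"
}

# Hypothetical model of the threshold rule. $1 = current step (1..3),
# $2 = load in percent, then up1 down2 up2 down3 in the order of the echo above.
gpu_next_step() {
    step=$1; load=$2; up1=$3; down2=$4; up2=$5; down3=$6
    case "$step" in
        1) if [ "$load" -ge "$up1" ]; then step=2; fi ;;
        2) if [ "$load" -ge "$up2" ]; then step=3
           elif [ "$load" -lt "$down2" ]; then step=1; fi ;;
        3) if [ "$load" -lt "$down3" ]; then step=2; fi ;;
    esac
    echo "$step"
}

gpu_valid_clocks 10
gpu_next_step 2 90 85 55 85 50
```

The first call lists the clocks for x = 2 to 10 (400 down to 80), and the second shows step 2 moving up to step 3 at 90% load with the thresholds used above.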
4) Hot Plug Thresholds, Sampling Interval & Frequency
#Set second core kick-in threshold for screen-on state
echo "25" > /sys/module/pm_hotplug/parameters/loadl
echo "70" > /sys/module/pm_hotplug/parameters/loadh
#Set second core kick-in threshold for screen-off state [Forcing second core NOT to turn on during screen-off make it easier for first core to hit deep idle, hence power savings]
echo "35" > /sys/module/pm_hotplug/parameters/loadl_scroff
echo "100" > /sys/module/pm_hotplug/parameters/loadh_scroff
#Set hot plug sampling intervals for screen-on state
echo "200" > /sys/module/pm_hotplug/parameters/rate
echo "400" > /sys/module/pm_hotplug/parameters/rate_cpuon

rate is the sampling interval at which load is checked to see if the second core should be kicked in (present load >= loadh).
rate_cpuon is the sampling interval at which load is checked to see if the second core, if already online, should be turned off (present load < loadl).
#Set hot plug sampling intervals for screen-off state
echo "800" > /sys/module/pm_hotplug/parameters/rate_scroff
rate_scroff is the sampling interval used in screen-off state to check if the second core should be turned on (current load >= loadh_scroff).
If the second core is already online, rate_cpuon is used as the sampling interval to check if the second core should be turned off.

For more info on Hotplug sampling and behavior, please see this post. The unit for these sampling intervals is jiffies. Since the frequency of the GS2 system timer is 200hz, divide the jiffy value by 200 to convert it into seconds.
#Set frequency below which second core will not be turned on, regardless of thresholds.
echo "500000" > /sys/module/pm_hotplug/parameters/freq_cpu1on
If CPU frequency <= 500 mhz, then the second core will not be turned on.
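
To get a feel for the sampling intervals above, here is a small conversion from jiffies to milliseconds, using the 200hz timer figure given in this section. The variable TIMER_HZ and the function jiffies_to_ms are names made up for this example.

```shell
TIMER_HZ=200   # GS2 system timer frequency, as stated above

# Convert a sampling interval in jiffies to milliseconds.
jiffies_to_ms() {
    echo $(( $1 * 1000 / TIMER_HZ ))
}

# The example values used above
for name_value in rate=200 rate_cpuon=400 rate_scroff=800; do
    echo "${name_value%%=*}: $(jiffies_to_ms "${name_value#*=}") ms"
done
```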
5) Deepsleep Levels
#Set deep sleep frequency & bus speed (L4=800 mhz and 0=400mhz bus speed)
echo "4" > /sys/devices/system/cpu/cpu0/cpufreq/deepsleep_cpulevel
echo "0" > /sys/devices/system/cpu/cpu0/cpufreq/deepsleep_buslevel
6) I/O Schedulers
#Set i/o scheduler
echo "sio" > /sys/block/mmcblk0/queue/scheduler
7) Bus Frequencies
#Set bus frequencies for highest-to-lowest CPU frequencies and enable static bus frequency scaling
echo "0 0 0 1 1 2 2 2" > /sys/devices/system/cpu/cpu0/cpufreq/busfreq_static
echo "enabled" > /sys/devices/system/cpu/cpu0/cpufreq/busfreq_static

Bus speeds: 0: 400 mhz | 1: 266 mhz | 2: 133 mhz
8) Schedule Multi Core & Idle Modes
#enable sched_mc
echo "1" > /sys/devices/system/cpu/sched_mc_power_savings
#enable AFTR + LPA (value 3)
echo "3" > /sys/module/cpuidle/parameters/enable_mask
9) Touch Sensitivity Parameters
#touch sensitivity
echo "50" > /sys/devices/virtual/sec/sec_touchscreen/tsp_threshold
Possible values are between 40 and 80. Lower value = higher sensitivity.
Also use Tegrak's Touch Move app from market to further control touch sensitivity
10) Charge Current
#set AC, Misc & USB charge current
echo "750 650 450" > /sys/devices/virtual/misc/charge_current/charge_current
AC refers to wall charger current, MISC refers to car charger current, USB refers to usb charge current from pc. Do not set AC & MISC to more than 1000mA or USB to more than 450mA.

11) Brightness Curve Settings
#brightness settings
echo "30" > /sys/class/misc/brightness_curve/min_bl
echo "1" > /sys/class/misc/brightness_curve/min_gamma
echo "24" > /sys/class/misc/brightness_curve/max_gamma

We will have lowest brightness or zero gamma for brightness level read from sensor < 30. Above that, it is linearly mapped to [min_gamma:max_gamma] which is [1:24] here.
To increase the minimum brightness, decrease the min_bl.
Possible values for min_bl = 0 to 255 | min_gamma = 0 to 24 | max_gamma = 0 to 24
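
A rough model of the mapping described above, useful for seeing what a given min_bl, min_gamma and max_gamma combination does. The exact formula and rounding used by the kernel are assumptions here: this sketch takes zero gamma below min_bl and a straight line from min_gamma at min_bl to max_gamma at 255. It also assumes min_bl is below 255. The function name brightness_to_gamma is made up.

```shell
# Hypothetical model of the mapping described above; the exact formula and
# rounding in the kernel are assumptions.
# $1 = brightness level (0..255), $2 = min_bl (< 255),
# $3 = min_gamma, $4 = max_gamma
brightness_to_gamma() {
    if [ "$1" -lt "$2" ]; then
        echo 0        # below min_bl: lowest brightness (zero gamma)
    else
        echo $(( $3 + ($1 - $2) * ($4 - $3) / (255 - $2) ))
    fi
}

# Settings from the example above: min_bl=30, min_gamma=1, max_gamma=24
for bl in 20 30 142 255; do
    echo "level ${bl} -> gamma $(brightness_to_gamma "$bl" 30 1 24)"
done
```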

12) Switch Hotplug/DualCore/SingleCore
#Dynamic hotplug mode
echo "on" > /sys/devices/virtual/misc/second_core/hotplug_on
#Single core mode
echo "off" > /sys/devices/virtual/misc/second_core/hotplug_on
echo "off" > /sys/devices/virtual/misc/second_core/second_core_on
#Dual core mode
echo "off" > /sys/devices/virtual/misc/second_core/hotplug_on
echo "on" > /sys/devices/virtual/misc/second_core/second_core_on
The above script is a replacement for Tegrak's 2nd Core app, for those who don't like apps to set something on boot.
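
If you switch modes often, the three cases can be wrapped in one function. This is just a convenience sketch around the same two files used above. The function name set_core_mode and the SECOND_CORE_DIR variable are made up for this example; the variable exists only so the function can be tried against a scratch directory before pointing it at sysfs.

```shell
# Hypothetical wrapper around the three modes above.
# SECOND_CORE_DIR can be overridden for a dry run against a scratch directory.
SECOND_CORE_DIR=${SECOND_CORE_DIR:-/sys/devices/virtual/misc/second_core}

set_core_mode() {
    case "$1" in
        hotplug)
            echo "on" > "$SECOND_CORE_DIR/hotplug_on" ;;
        single)
            echo "off" > "$SECOND_CORE_DIR/hotplug_on"
            echo "off" > "$SECOND_CORE_DIR/second_core_on" ;;
        dual)
            echo "off" > "$SECOND_CORE_DIR/hotplug_on"
            echo "on" > "$SECOND_CORE_DIR/second_core_on" ;;
        *)
            echo "usage: set_core_mode hotplug|single|dual" >&2
            return 1 ;;
    esac
}
```

After sourcing this in a root shell on a kernel that has these files, set_core_mode dual would switch to dual core mode and set_core_mode hotplug would go back to the default.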
