There are plenty of articles explaining why the Python (that is, CPython) GIL, the Global Interpreter Lock, exists. In short: the GIL prevents multithreaded pure Python code from using multiple CPU cores.
However, in Vaex we run the most compute-intensive tasks in C++, with the GIL released. This is common practice in high-performance Python libraries, where Python acts only as high-level glue.
The GIL has to be released explicitly. That is the programmer's responsibility and is easy to forget, which leads to inefficient use of the machine. I recently played the forgetful role myself, and found a similar issue in Apache Arrow (a dependency of Vaex, so when Arrow does not release the GIL, we, and everyone else, take a performance hit).
Also, when running on 64 cores, Vaex's performance is sometimes far from ideal. It may use 4000% CPU instead of 6400%, which I am not happy with. Instead of flipping switches at random to see what they do, I want to understand what is happening and, if the GIL is the problem, why and where it slows Vaex down.
Why I am writing this
I am planning a series of articles describing some of the tools and techniques for profiling and tracing Python together with native extensions, and how these tools can be combined to analyze and visualize what Python is doing and when the GIL is taken or released.
I hope this leads to better tracing, profiling, and other performance tooling in the Python ecosystem, and to better performance across the ecosystem as a whole.
Prerequisites
Linux
You need root access to a Linux machine (sudo is enough), or you can ask your system administrator to run the commands below for you. For the rest of this article, user privileges are sufficient.
Perf
Make sure perf is installed. The command below works on yum-based distributions; on Ubuntu, perf is usually provided by the linux-tools packages instead:
$ sudo yum install perf
Kernel configuration
To allow running perf as a regular user:
# Enable users to run perf (use at own risk)
$ sudo sysctl kernel.perf_event_paranoid=-1
# Enable users to see schedule trace events:
$ sudo mount -o remount,mode=755 /sys/kernel/debug
$ sudo mount -o remount,mode=755 /sys/kernel/debug/tracing
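Before recording anything, it can be handy to check from Python whether the sysctl setting took effect. The following is a minimal sketch, not part of perf or per4m: the helper name read_perf_event_paranoid is made up for this example, and it only assumes the standard Linux location /proc/sys/kernel/perf_event_paranoid.

```python
from pathlib import Path
from typing import Optional

# Standard Linux location of the setting changed by the sysctl command above.
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def read_perf_event_paranoid(path: str = PARANOID_PATH) -> Optional[int]:
    """Return the current perf_event_paranoid level, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not on Linux, or the file is unreadable or malformed.
        return None


if __name__ == "__main__":
    level = read_perf_event_paranoid()
    if level is None:
        print("could not read", PARANOID_PATH)
    else:
        print("kernel.perf_event_paranoid =", level)
```

After the sysctl command above, the reported level should be -1.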
Python packages
We will use VizTracer and per4m:
$ pip install "viztracer>=0.11.2" "per4m>=0.1,<0.2"
Tracking thread and process states with perf
There is no way to get at the GIL state from within Python (other than by polling), because there is no API for it. We can, however, monitor the state from the kernel, and the right tool for that is perf.
Using perf (also known as perf_events), we can listen for state changes of processes and threads (we only care about sleeping and running) and log them. Perf can look overwhelming, but it is a powerful tool. If you want to know more about it, I recommend the articles on Julia Evans' or Brendan Gregg's websites.
To build some intuition, we first apply perf to a trivial program:
import time
from threading import Thread

def sleep_a_bit():
    time.sleep(1)

def main():
    t = Thread(target=sleep_a_bit)
    t.start()
    t.join()

main()
To reduce noise, we listen to only a few events (note the use of a wildcard):
$ perf record -e sched:sched_switch -e sched:sched_process_fork \
-e 'sched:sched_wak*' -- python -m per4m.example0
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0,032 MB perf.data (33 samples) ]
Then we use the perf script command to write out human-readable output that is also suitable for parsing:
$ perf script
:3040108 3040108 [032] 5563910.979408: sched:sched_waking: comm=perf pid=3040114 prio=120 target_cpu=031
:3040108 3040108 [032] 5563910.979431: sched:sched_wakeup: comm=perf pid=3040114 prio=120 target_cpu=031
python 3040114 [031] 5563910.995616: sched:sched_waking: comm=kworker/31:1 pid=2502104 prio=120 target_cpu=031
python 3040114 [031] 5563910.995618: sched:sched_wakeup: comm=kworker/31:1 pid=2502104 prio=120 target_cpu=031
python 3040114 [031] 5563910.995621: sched:sched_waking: comm=ksoftirqd/31 pid=198 prio=120 target_cpu=031
python 3040114 [031] 5563910.995622: sched:sched_wakeup: comm=ksoftirqd/31 pid=198 prio=120 target_cpu=031
python 3040114 [031] 5563910.995624: sched:sched_switch: prev_comm=python prev_pid=3040114 prev_prio=120 prev_state=R+ ==> next_comm=kworker/31:1 next_pid=2502104 next_prio=120
python 3040114 [031] 5563911.003612: sched:sched_waking: comm=kworker/32:1 pid=2467833 prio=120 target_cpu=032
python 3040114 [031] 5563911.003614: sched:sched_wakeup: comm=kworker/32:1 pid=2467833 prio=120 target_cpu=032
python 3040114 [031] 5563911.083609: sched:sched_waking: comm=ksoftirqd/31 pid=198 prio=120 target_cpu=031
python 3040114 [031] 5563911.083612: sched:sched_wakeup: comm=ksoftirqd/31 pid=198 prio=120 target_cpu=031
python 3040114 [031] 5563911.083613: sched:sched_switch: prev_comm=python prev_pid=3040114 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/31 next_pid=198 next_prio=120
python 3040114 [031] 5563911.108984: sched:sched_waking: comm=node pid=2446812 prio=120 target_cpu=045
python 3040114 [031] 5563911.109059: sched:sched_waking: comm=node pid=2446812 prio=120 target_cpu=045
python 3040114 [031] 5563911.112250: sched:sched_process_fork: comm=python pid=3040114 child_comm=python child_pid=3040116
python 3040114 [031] 5563911.112260: sched:sched_wakeup_new: comm=python pid=3040116 prio=120 target_cpu=037
python 3040114 [031] 5563911.112262: sched:sched_wakeup_new: comm=python pid=3040116 prio=120 target_cpu=037
python 3040114 [031] 5563911.112273: sched:sched_switch: prev_comm=python prev_pid=3040114 prev_prio=120 prev_state=S ==> next_comm=swapper/31 next_pid=0 next_prio=120
python 3040116 [037] 5563911.112418: sched:sched_waking: comm=python pid=3040114 prio=120 target_cpu=031
python 3040116 [037] 5563911.112450: sched:sched_waking: comm=python pid=3040114 prio=120 target_cpu=031
python 3040116 [037] 5563911.112473: sched:sched_wake_idle_without_ipi: cpu=31
swapper 0 [031] 5563911.112476: sched:sched_wakeup: comm=python pid=3040114 prio=120 target_cpu=031
python 3040114 [031] 5563911.112485: sched:sched_switch: prev_comm=python prev_pid=3040114 prev_prio=120 prev_state=S ==> next_comm=swapper/31 next_pid=0 next_prio=120
python 3040116 [037] 5563911.112485: sched:sched_waking: comm=python pid=3040114 prio=120 target_cpu=031
python 3040116 [037] 5563911.112489: sched:sched_waking: comm=python pid=3040114 prio=120 target_cpu=031
python 3040116 [037] 5563911.112496: sched:sched_switch: prev_comm=python prev_pid=3040116 prev_prio=120 prev_state=S ==> next_comm=swapper/37 next_pid=0 next_prio=120
swapper 0 [031] 5563911.112497: sched:sched_wakeup: comm=python pid=3040114 prio=120 target_cpu=031
python 3040114 [031] 5563911.112513: sched:sched_switch: prev_comm=python prev_pid=3040114 prev_prio=120 prev_state=S ==> next_comm=swapper/31 next_pid=0 next_prio=120
swapper 0 [037] 5563912.113490: sched:sched_waking: comm=python pid=3040116 prio=120 target_cpu=037
swapper 0 [037] 5563912.113529: sched:sched_wakeup: comm=python pid=3040116 prio=120 target_cpu=037
python 3040116 [037] 5563912.113595: sched:sched_waking: comm=python pid=3040114 prio=120 target_cpu=031
python 3040116 [037] 5563912.113620: sched:sched_waking: comm=python pid=3040114 prio=120 target_cpu=031
swapper 0 [031] 5563912.113697: sched:sched_wakeup: comm=python pid=3040114 prio=120 target_cpu=031
Look at the fourth column (time in seconds): you can see where the program went to sleep, and that one second passes. Here it enters the sleep state:
python 3040114 [031] 5563911.112513: sched:sched_switch: prev_comm=python prev_pid=3040114 prev_prio=120 prev_state=S ==> next_comm=swapper/31 next_pid=0 next_prio=120
This means the kernel changed the state of the Python thread to S (= sleeping).
One second later, the program is woken up:
swapper 0 [031] 5563912.113697: sched:sched_wakeup: comm=python pid=3040114 prio=120 target_cpu=031
Of course, you need tooling to really see what is going on. The output can be parsed easily with per4m, but before we continue, I want to visualize the flow of a slightly more complex program using VizTracer.
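To give an idea of what parsing this output involves, here is a minimal sketch that extracts the fields of a sched_switch line such as the one above. It is not how per4m is implemented; the names SwitchEvent and parse_sched_switch are made up for this example, and the regular expression assumes command names without spaces.

```python
import re
from typing import NamedTuple, Optional

# Matches the start of a `perf script` line for a sched:sched_switch event:
# command, pid, [cpu], timestamp, event name, then the key=value fields.
SWITCH_RE = re.compile(
    r"^\s*(?P<comm>\S+)\s+(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<time>\d+\.\d+):\s+sched:sched_switch:\s+(?P<fields>.*)$"
)


class SwitchEvent(NamedTuple):
    comm: str
    pid: int
    cpu: int
    time: float      # seconds, as printed by perf script
    prev_state: str  # e.g. "S" (sleeping) or "R" (still runnable)


def parse_sched_switch(line: str) -> Optional[SwitchEvent]:
    """Parse one sched_switch line; return None for any other line."""
    match = SWITCH_RE.match(line)
    if match is None:
        return None
    state = re.search(r"prev_state=(\S+)", match["fields"])
    if state is None:
        return None
    return SwitchEvent(
        comm=match["comm"],
        pid=int(match["pid"]),
        cpu=int(match["cpu"]),
        time=float(match["time"]),
        prev_state=state.group(1),
    )


SAMPLE = (
    "python 3040114 [031] 5563911.112513: sched:sched_switch: "
    "prev_comm=python prev_pid=3040114 prev_prio=120 prev_state=S "
    "==> next_comm=swapper/31 next_pid=0 next_prio=120"
)

event = parse_sched_switch(SAMPLE)
print(event.pid, event.prev_state)  # prints: 3040114 S
```

Pairing each such event with the next sched_wakeup for the same pid would give the sleep intervals of a thread.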
VizTracer
VizTracer is a Python tracer that visualizes, in the browser, what your program is doing. Let us apply it to a more complex program:
import threading
import time

def some_computation():
    total = 0
    for i in range(1_000_000):
        total += i
    return total

def main():
    thread1 = threading.Thread(target=some_computation)
    thread2 = threading.Thread(target=some_computation)
    thread1.start()
    thread2.start()
    time.sleep(0.2)
    for thread in [thread1, thread2]:
        thread.join()

main()
Running the tracer gives:
$ viztracer -o example1.html --ignore_frozen -m per4m.example1
Loading finish
Saving report to /home/maartenbreddels/github/maartenbreddels/per4m/example1.html ...
Dumping trace data to json, total entries: 94, estimated json file size: 11.0KiB
Generating HTML report
Report saved.
The resulting HTML renders like this:
From this, it looks as if some_computation is executed in parallel (twice), even though we know the GIL prevents that. So what is going on?
Combining the VizTracer and perf output
Let us run perf again, as we did for example0.py. This time we add the argument -k CLOCK_MONOTONIC, so that perf uses the same clock as VizTracer, and we ask VizTracer to generate JSON instead of HTML:
$ perf record -e sched:sched_switch -e sched:sched_process_fork -e 'sched:sched_wak*' \
-k CLOCK_MONOTONIC -- viztracer -o viztracer1.json --ignore_frozen -m per4m.example1
次ã«ãper4mã䜿çšããŠãperfã¹ã¯ãªããã®çµæãVizTracerãèªã¿åããJSONã«å€æããŸãã
$ perf script | per4m perf2trace sched -o perf1.json
Wrote to perf1.json
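To illustrate what such a conversion produces, the sketch below turns sleep intervals into trace events. It assumes the Chrome Trace Event JSON layout (a traceEvents list of events with ph, ts, and dur fields, times in microseconds), which is the format VizTracer reads; the function name, the event label, and the input data are made up, and this is not per4m's actual implementation.

```python
import json
from typing import Dict, List, Tuple


def sleep_intervals_to_trace(
    intervals: List[Tuple[int, float, float]],
) -> Dict[str, list]:
    """Turn (tid, start_s, end_s) sleep intervals into trace events.

    Each interval becomes one "complete" event (ph == "X") whose
    timestamp and duration are expressed in microseconds.
    """
    events = []
    for tid, start_s, end_s in intervals:
        events.append(
            {
                "name": "S(sleeping)",  # illustrative label
                "ph": "X",
                "pid": tid,  # assumption: one track per thread id
                "tid": tid,
                "ts": start_s * 1e6,
                "dur": (end_s - start_s) * 1e6,
            }
        )
    return {"traceEvents": events}


# Hypothetical input: thread 1 slept from t=1.0 s to t=1.5 s.
trace = sleep_intervals_to_trace([(1, 1.0, 1.5)])
print(json.dumps(trace, indent=2))
```

Because perf was asked to use CLOCK_MONOTONIC, timestamps produced this way can share a time axis with the VizTracer events.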
Next, we use VizTracer to combine the two JSON files:
$ viztracer --combine perf1.json viztracer1.json -o example1_state.html
Saving report to /home/maartenbreddels/github/maartenbreddels/per4m/example1.html ...
Dumping trace data to json, total entries: 131, estimated json file size: 15.4KiB
Generating HTML report
Report saved.
This gives us:
Clearly, the threads regularly go to sleep because of the GIL and do not execute in parallel.
Note: the sleep phases are about 5 milliseconds long, which corresponds to the default value of sys.getswitchinterval.
Detecting the GIL
We see the process going to sleep, but we cannot tell the difference between a sleep triggered by a call to time.sleep and one caused by the GIL. There are several ways to tell them apart; let us look at two of them.
Via stack traces
Using perf record -g (or better, perf record --call-graph dwarf, which implies -g), we get a stack trace for each perf event.
$ perf record -e sched:sched_switch -e sched:sched_process_fork -e 'sched:sched_wak*'\
-k CLOCK_MONOTONIC --call-graph dwarf -- viztracer -o viztracer1-gil.json --ignore_frozen -m per4m.example1
Loading finish
Saving report to /home/maartenbreddels/github/maartenbreddels/per4m/viztracer1-gil.json ...
Dumping trace data to json, total entries: 94, estimated json file size: 11.0KiB
Report saved.
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0,991 MB perf.data (164 samples) ]
Looking at the perf script output (where we add --no-inline for performance reasons), we get a lot of information. Look at a state change event: we can see that take_gil was called!
$ perf script --no-inline | less
...
viztracer 3306851 [059] 5614683.022539: sched:sched_switch: prev_comm=viztracer prev_pid=3306851 prev_prio=120 prev_state=S ==> next_comm=swapper/59 next_pid=0 next_prio=120
ffffffff96ed4785 __sched_text_start+0x375 ([kernel.kallsyms])
ffffffff96ed4785 __sched_text_start+0x375 ([kernel.kallsyms])
ffffffff96ed4b92 schedule+0x42 ([kernel.kallsyms])
ffffffff9654a51b futex_wait_queue_me+0xbb ([kernel.kallsyms])
ffffffff9654ac85 futex_wait+0x105 ([kernel.kallsyms])
ffffffff9654daff do_futex+0x10f ([kernel.kallsyms])
ffffffff9654dfef __x64_sys_futex+0x13f ([kernel.kallsyms])
ffffffff964044c7 do_syscall_64+0x57 ([kernel.kallsyms])
ffffffff9700008c entry_SYSCALL_64_after_hwframe+0x44 ([kernel.kallsyms])
7f4884b977b1 pthread_cond_timedwait@@GLIBC_2.3.2+0x271 (/usr/lib/x86_64-linux-gnu/libpthread-2.31.so)
55595c07fe6d take_gil+0x1ad (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfaa0b3 PyEval_RestoreThread+0x23 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c000872 lock_PyThread_acquire_lock+0x1d2 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfe71f3 _PyMethodDef_RawFastCallKeywords+0x263 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfe7313 _PyCFunction_FastCallKeywords+0x23 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c01d657 call_function+0x3b7 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060d00 _PyEval_EvalFrameDefault+0x610 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfb6db1 _PyEval_EvalCodeWithName+0x251 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfd6b00 _PyFunction_FastCallKeywords+0x520 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c01d334 call_function+0x94 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060d00 _PyEval_EvalFrameDefault+0x610 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfb6db1 _PyEval_EvalCodeWithName+0x251 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfd6b00 _PyFunction_FastCallKeywords+0x520 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c01d334 call_function+0x94 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060d00 _PyEval_EvalFrameDefault+0x610 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfd6766 _PyFunction_FastCallKeywords+0x186 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c01d334 call_function+0x94 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060d00 _PyEval_EvalFrameDefault+0x610 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfd6766 _PyFunction_FastCallKeywords+0x186 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060ae4 _PyEval_EvalFrameDefault+0x3f4 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfb6db1 _PyEval_EvalCodeWithName+0x251 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c074e5d builtin_exec+0x33d (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfe7078 _PyMethodDef_RawFastCallKeywords+0xe8 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfe7313 _PyCFunction_FastCallKeywords+0x23 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c066c39 _PyEval_EvalFrameDefault+0x6549 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfb77e0 _PyEval_EvalCodeWithName+0xc80 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfd6b62 _PyFunction_FastCallKeywords+0x582 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c01d334 call_function+0x94 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060d00 _PyEval_EvalFrameDefault+0x610 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfd6766 _PyFunction_FastCallKeywords+0x186 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c01d334 call_function+0x94 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060d00 _PyEval_EvalFrameDefault+0x610 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfd6766 _PyFunction_FastCallKeywords+0x186 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c01d334 call_function+0x94 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060d00 _PyEval_EvalFrameDefault+0x610 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfd6766 _PyFunction_FastCallKeywords+0x186 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c060ae4 _PyEval_EvalFrameDefault+0x3f4 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfb6db1 _PyEval_EvalCodeWithName+0x251 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595bfb81e2 PyEval_EvalCode+0x22 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c0c51d1 run_mod+0x31 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c0cf31d PyRun_FileExFlags+0x9d (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c0cf50a PyRun_SimpleFileExFlags+0x1ba (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c0d05f0 pymain_main+0x3e0 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
55595c0d067b _Py_UnixMain+0x3b (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
7f48849bc0b2 __libc_start_main+0xf2 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
55595c075100 _start+0x28 (/home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
...
Note: we also see that pthread_cond_timedwait is called. This is what https://github.com/sumerc/gilstats.py uses with eBPF, in case you are interested in other mutexes.
Another note: there is no Python stack trace here; we get _PyEval_EvalFrameDefault and friends instead. I plan to write about how to inject Python stack traces in the future.
The conversion tool per4m perf2trace understands this, and generates different output when take_gil is present in the stack trace:
$ perf script --no-inline | per4m perf2trace sched -o perf1-gil.json
Wrote to perf1-gil.json
$ viztracer --combine perf1-gil.json viztracer1-gil.json -o example1-gil.html
Saving report to /home/maartenbreddels/github/maartenbreddels/per4m/example1.html ...
Dumping trace data to json, total entries: 131, estimated json file size: 15.4KiB
Generating HTML report
Report saved.
This gives us:
Now we can see exactly where the GIL plays a role!
Via probes (kprobes/uprobes)
We now know when a process goes to sleep (because of the GIL or for other reasons), but if we want more detail on when the GIL is taken or dropped, we need to know when take_gil and drop_gil are called and when they return. We can get such a trace by probing with perf. In userland, probes are called uprobes; they are similar to kprobes, which, as you may have guessed, operate in kernel space. Again, Julia Evans is an excellent source of further information.
Let us install the four probes:
sudo perf probe -f -x `which python` python:take_gil=take_gil
sudo perf probe -f -x `which python` python:take_gil=take_gil%return
sudo perf probe -f -x `which python` python:drop_gil=drop_gil
sudo perf probe -f -x `which python` python:drop_gil=drop_gil%return
Added new events:
python:take_gil (on take_gil in /home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
python:take_gil_1 (on take_gil in /home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
You can now use it in all perf tools, such as:
perf record -e python:take_gil_1 -aR sleep 1
Failed to find "take_gil%return",
because take_gil is an inlined function and has no return point.
Added new event:
python:take_gil__return (on take_gil%return in /home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
You can now use it in all perf tools, such as:
perf record -e python:take_gil__return -aR sleep 1
Added new events:
python:drop_gil (on drop_gil in /home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
python:drop_gil_1 (on drop_gil in /home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
You can now use it in all perf tools, such as:
perf record -e python:drop_gil_1 -aR sleep 1
Failed to find "drop_gil%return",
because drop_gil is an inlined function and has no return point.
Added new event:
python:drop_gil__return (on drop_gil%return in /home/maartenbreddels/miniconda/envs/dev/bin/python3.7)
You can now use it in all perf tools, such as:
perf record -e python:drop_gil__return -aR sleep 1
There are some complaints, and because drop_gil and take_gil are inlined, several probes/events are added for each of them (meaning the function is present multiple times in the binary), but everything works.
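The four probe commands above differ only in the function name and the %return suffix, so they can be generated rather than typed. The sketch below only builds and prints the command lines and does not run them; the helper name gil_probe_commands is made up for this example, and it assumes sys.executable points at the same binary that `which python` would return.

```python
import shlex
import sys
from typing import List


def gil_probe_commands(python_binary: str = sys.executable) -> List[List[str]]:
    """Build the four `perf probe` command lines shown above (not executed)."""
    commands = []
    for function in ("take_gil", "drop_gil"):
        for suffix in ("", "%return"):
            commands.append(
                [
                    "sudo", "perf", "probe", "-f", "-x", python_binary,
                    f"python:{function}={function}{suffix}",
                ]
            )
    return commands


for command in gil_probe_commands():
    # Print a shell-quoted version for copy and paste.
    print(shlex.join(command))
```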
泚ïŒå¯Ÿå¿ãã
take_gil
/
drop_gil
ïŒããã³ãã®çµæïŒããã®åé¡ã解決ããããã«æ£åžžã«æ©èœããããã«ãPythonãã€ããªïŒconda-forgeããïŒãã³ã³ãã€ã«ãããã®ã¯å¹žéã ã£ããããããŸãã ã
ãããŒãã¯ããã©ãŒãã³ã¹ã«åœ±é¿ãäžããããããŒãããã¢ã¯ãã£ããã§ããå ŽåïŒããšãã°ãããã©ãŒãã³ã¹ãããããŒããç£èŠããå ŽåïŒã«ã®ã¿ãå¥ã®ãã©ã³ãã§ã³ãŒããå®è¡ããããšã«æ³šæããŠãã ãããç£èŠäžã«ã圱é¿ãåããããŒãžãç£èŠå¯Ÿè±¡ããã»ã¹çšã«ã³ããŒããã ãã§ãã¯ãã€ã³ããé©åãªå Žæã«æ¿å ¥ãããŸãïŒx86ããã»ããµã®å Žåã¯INT3ïŒããã§ãã¯ãã€ã³ãã¯ããªãŒããŒããããã»ãšãã©ãªãããã©ãŒãã³ã¹ã®ã€ãã³ããçºçãããŸãããããŒããåé€ããå Žåã¯ã次ã®ã³ãã³ããå®è¡ããŸãã
$ sudo perf probe --del 'python*'
Now perf knows about the new events it can listen to, so let us run it again with the extra argument -e 'python:*gil*':
$ perf record -e sched:sched_switch -e sched:sched_process_fork -e 'sched:sched_wak*' -k CLOCK_MONOTONIC \
-e 'python:*gil*' -- viztracer -o viztracer1-uprobes.json --ignore_frozen -m per4m.example1
泚ïŒåé€ããŸãããåé€ã
--call-graph dwarf
ãªããšãperfãéã«åãããã€ãã³ãã倱ãããŸãã
次ã«ãper4m perf2traceã䜿çšããŠãVizTracerã§çè§£ã§ããJSONã«å€æãããšåæã«ãæ°ããçµ±èšãååŸããŸãã
$ perf script --no-inline | per4m perf2trace gil -o perf1-uprobes.json
...
Summary of threads:
PID total(us) no gil% has gil% gil wait%
-------- ----------- ----------- ------------ -------------
3353567* 164490.0 65.9 27.3 6.7
3353569 66560.0 0.3 48.2 51.5
3353570 60900.0 0.0 56.4 43.6
High 'no gil' is good, we like low 'has gil',
and we don't want 'gil wait'. (* indicates main thread)
...
Wrote to perf1-uprobes.json
The per4m perf2trace gil subcommand also gives gil_load-like output. From it we see that, as expected, both threads wait for the GIL about half of the time.
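The three percentage columns can be read as a split of each thread's traced time. The sketch below assumes that the columns partition the total, as the rows above suggest (each row adds up to roughly 100%); the function name and the input durations are made up for illustration, and this is not per4m's code.

```python
from typing import Dict


def gil_summary(total_us: float, has_gil_us: float, wait_us: float) -> Dict[str, float]:
    """Split a thread's traced time into the three table columns, in percent."""
    if total_us <= 0:
        raise ValueError("total_us must be positive")
    # Whatever is neither holding nor waiting for the GIL runs without it.
    no_gil_us = total_us - has_gil_us - wait_us
    return {
        "no gil%": 100.0 * no_gil_us / total_us,
        "has gil%": 100.0 * has_gil_us / total_us,
        "gil wait%": 100.0 * wait_us / total_us,
    }


# Made-up durations in microseconds, for illustration only.
summary = gil_summary(total_us=1000.0, has_gil_us=400.0, wait_us=100.0)
print(summary)
```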
Using the same perf.data file recorded by perf, we can also generate the thread or process state information. However, because we ran without stack traces, we do not know whether a process is sleeping because of the GIL or not.
$ perf script --no-inline | per4m perf2trace sched -o perf1-state.json
Wrote to perf1-state.json
Finally, we combine all three outputs:
$ viztracer --combine perf1-state.json perf1-uprobes.json viztracer1-uprobes.json -o example1-uprobes.html
Saving report to /home/maartenbreddels/github/maartenbreddels/per4m/example1-uprobes.html ...
Dumping trace data to json, total entries: 10484, estimated json file size: 1.2MiB
Generating HTML report
Report saved.
VizTracer gives a good idea of who has the GIL and who is waiting for it:
Above each thread, we see when the thread or process is waiting for the GIL and when it holds the GIL (marked LOCK). Note that these periods overlap with the periods where the thread or process is awake (running!). Also note that we only ever see one thread or process in the running state, as it should be because of the GIL.
The time spent in calls to take_gil, that is, the time between the locks (and thus the time between going to sleep and waking up), is exactly the time in the gil wait% column of the table above. The time during which each thread holds the GIL, labeled LOCK, corresponds to the time in the has gil% column.
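As a rough illustration of how waiting time can be derived from the probe events, the sketch below pairs each take_gil entry with the following take_gil__return for a single thread. The event names match the probes installed earlier, but the function, the input format, and the timestamps are assumptions made for this example rather than per4m's implementation.

```python
from typing import Iterable, List, Tuple


def gil_wait_intervals(
    events: Iterable[Tuple[float, str]],
) -> List[Tuple[float, float]]:
    """Pair each take_gil entry with the next take_gil return.

    `events` holds (timestamp, kind) tuples for a single thread, sorted by
    time, where kind is "take_gil" or "take_gil__return".
    """
    intervals = []
    entered_at = None
    for timestamp, kind in events:
        if kind == "take_gil":
            entered_at = timestamp
        elif kind == "take_gil__return" and entered_at is not None:
            intervals.append((entered_at, timestamp))
            entered_at = None
    return intervals


# Made-up timestamps in microseconds, for illustration only.
events = [
    (0.0, "take_gil"),
    (5000.0, "take_gil__return"),
    (9000.0, "take_gil"),
    (14500.0, "take_gil__return"),
]
waits = gil_wait_intervals(events)
total_wait = sum(end - start for start, end in waits)
print(waits, total_wait)
```

Dividing the total waiting time by the traced time of the thread gives a gil wait% style number.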
Releasing the Kraken... ehm, GIL
We have seen how a multithreaded pure Python program is limited by the GIL, which lets only one thread or process run at a time (for a single Python process, of course, and in the future perhaps for a single (sub)interpreter). Now let us see what happens when the GIL is released, as happens when executing NumPy functions.
The second example runs some_numpy_computation, which calls a NumPy function M = 4 times, in two threads in parallel, while the main thread executes pure Python code.
import threading
import time
import numpy as np

N = 1024*1024*32
M = 4
x = np.arange(N, dtype='f8')

def some_numpy_computation():
    total = 0
    for i in range(M):
        total += x.sum()
    return total

def main(args=None):
    thread1 = threading.Thread(target=some_numpy_computation)
    thread2 = threading.Thread(target=some_numpy_computation)
    thread1.start()
    thread2.start()
    total = 0
    for i in range(2_000_000):
        total += i
    for thread in [thread1, thread2]:
        thread.join()

main()
Instead of running this script with perf and VizTracer by hand, we now use the per4m giltracer utility, which automates all the steps above. It is also a bit smarter: essentially, it runs perf twice, once without stack traces and once with them, and it imports the module/script before executing its main function, to get rid of uninteresting tracing such as the imports themselves. This runs fast enough that we do not lose events.
$ giltracer --state-detect -o example2-uprobes.html -m per4m.example2
...
Summary of threads:
PID total(us) no gil% has gil% gil wait%
-------- ----------- ----------- ------------ -------------
3373601* 1359990.0 95.8 4.2 0.1
3373683 60276.4 84.6 2.2 13.2
3373684 57324.0 89.2 1.9 8.9
High 'no gil' is good, we like low 'has gil',
and we don't want 'gil wait'. (* indicates main thread)
...
Saving report to /home/maartenbreddels/github/maartenbreddels/per4m/example2-uprobes.html ...
...
We see that while the main thread runs Python code (it holds the GIL, indicated by the word LOCK above it), the other threads run in parallel. Note that in the pure Python example only one thread or process ran at a time, whereas here three threads really do run in parallel. This is possible because the NumPy routines, which end up in C/C++/Fortran, release the GIL.
However, the threads are still affected by the GIL: once a NumPy function returns to Python, it needs to reacquire the GIL, as can be seen from the long take_gil blocks. This takes about 10% of each thread's time.
Jupyter integration
My workflow often involves working from a MacBook (which does not run perf, but does support dtrace) connected remotely to a Linux machine, so I use a Jupyter notebook to execute code remotely. And being a Jupyter developer, I had to create a wrapper as a cell magic.
# this registers the giltracer cell magic
%load_ext per4m
%%giltracer
# this calls the main defined above, but this can also be a multiline code cell
main()
Saving report to /tmp/tmpvi8zw9ut/viztracer.json ...
Dumping trace data to json, total entries: 117, estimated json file size: 13.7KiB
Report saved.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0,094 MB /tmp/tmpvi8zw9ut/perf.data (244 samples) ]
Wait for perf to finish...
perf script -i /tmp/tmpvi8zw9ut/perf.data --no-inline --ns | per4m perf2trace gil -o /tmp/tmpvi8zw9ut/giltracer.json -q -v
Saving report to /home/maartenbreddels/github/maartenbreddels/fastblog/_notebooks/giltracer.html ...
Dumping trace data to json, total entries: 849, estimated json file size: 99.5KiB
Generating HTML report
Report saved.
Download giltracer.html
Open giltracer.html in a new tab (might not work due to security restrictions)
Conclusion
Using perf, we can detect process or thread states, which helps us understand which process or thread holds the GIL in Python. With stack traces, we can also determine whether a sleep was caused by the GIL rather than by, for example, time.sleep.
Combining uprobes with perf, we can trace the calling and returning of the take_gil and drop_gil functions, gaining even more insight into the effect of the GIL on your Python program.
Our work is made easier by the experimental per4m package, which converts perf script output into the VizTracer JSON format, and by some orchestration tooling.
TLDR
If you just want to see the effect of the GIL, run this once:
sudo yum install perf
sudo sysctl kernel.perf_event_paranoid=-1
sudo mount -o remount,mode=755 /sys/kernel/debug
sudo mount -o remount,mode=755 /sys/kernel/debug/tracing
sudo perf probe -f -x `which python` python:take_gil=take_gil
sudo perf probe -f -x `which python` python:take_gil=take_gil%return
sudo perf probe -f -x `which python` python:drop_gil=drop_gil
sudo perf probe -f -x `which python` python:drop_gil=drop_gil%return
pip install "viztracer>=0.11.2" "per4m>=0.1,<0.2"
Usage example:
# script
$ giltracer per4m/example2.py
# module
$ giltracer -m per4m.example2
# add thread/process state detection
$ giltracer --state-detect -m per4m.example2
# without uprobes (in case that fails)
$ giltracer --state-detect --no-gil-detect -m per4m.example2
Future plans
I wish I did not have to develop these tools. Hopefully I can inspire someone to build better tooling than I can, so that I can focus on writing high-performance code. Still, I have the following plans for the future:
- Show performance counters in VizTracer, to see cache misses or process stalls.
- Inject Python stack traces into perf traces, to combine them with tools such as http://www.brendangregg.com/offcpuanalysis.html
- Repeat the same exercise using dtrace, for use on macOS.
- Automatically detect which C functions do not release the GIL (https://github.com/vaexio/vaex/pull/1114, https://github.com/apache/arrow/pull/7756)
- https://github.com/h5py/h5py/issues/1516