Machine learning is increasingly moving from hand-designed models to automatically optimized pipelines, using tools such as H2O, TPOT, and auto-sklearn. Together with methods such as random search, these libraries aim to simplify model selection and part of the tuning work by finding the best model for a dataset with no manual intervention. Feature engineering, however, arguably the more valuable part of the machine learning pipeline, remains almost entirely a human task.
Feature engineering, also called feature creation, is the process of constructing new features from existing data in order to train a machine learning model. This step can matter more than the model actually used: a machine learning algorithm learns only from the data we give it, so creating features that are relevant to the task is absolutely crucial (see the excellent paper "A Few Useful Things to Know about Machine Learning").
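Typically, feature engineering is a drawn-out manual process that relies on domain knowledge, intuition, and data manipulation. The process can be extremely tedious, and the final features are limited both by human subjectivity and by time. Automated feature engineering aims to help the data scientist by automatically creating many candidate features from a dataset, from which the best can be selected and used for training.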
In this article, we will walk through an example of automated feature engineering with the featuretools Python library. We will use a sample dataset to show the basics (stay tuned for future posts that use real-world data). The complete code for this article is available on GitHub.
Feature Engineering Basics
Feature engineering means building additional features out of existing data, which is often spread across multiple related tables. It requires extracting the relevant information from the data and getting it into a single table that can then be used to train a machine learning model.
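Constructing features is very time-consuming because each new feature usually takes several steps to build, especially when it uses information from more than one table. We can group the operations of feature creation into two categories: transformations and aggregations. Let's look at a few examples to see these concepts in action.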
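A transformation acts on a single table (in Python terms, a table is just a Pandas DataFrame) and creates new features from one or more of its existing columns. For example, given the clients table, we can create features by finding the month of the joined column or by taking the natural log of the income column. Both are transformations because they use information from only one table.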
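As a minimal sketch of these two transformations, assuming a small stand-in for the clients table (the rows below are invented for illustration; only the column names joined and income come from the text):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the clients table (values invented for illustration)
clients = pd.DataFrame({
    "client_id": [1, 2, 3],
    "joined": pd.to_datetime(["2012-04-15", "2015-11-02", "2018-01-30"]),
    "income": [50_000.0, 72_500.0, 31_000.0],
})

# Transformation 1: month in which each client joined
clients["join_month"] = clients["joined"].dt.month

# Transformation 2: natural log of income
clients["log_income"] = np.log(clients["income"])
```

Both new columns are computed row by row from columns of the same table, which is what distinguishes a transformation from an aggregation.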
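Aggregations, on the other hand, are performed across tables. They use a one-to-many relationship to group observations and then calculate statistics. For example, if we have another table with information on client loans, where each client may have several loans, we can calculate statistics such as the average, maximum, and minimum loan amount for each client.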
This process involves grouping the loans table by client, calculating the aggregations, and then merging the resulting data into the client data. Here is how we would do that in Python using the Pandas library:
import pandas as pd
# Group loans by client id and calculate mean, max, min of loans
stats = loans.groupby('client_id')['loan_amount'].agg(['mean', 'max', 'min'])
stats.columns = ['mean_loan_amount', 'max_loan_amount', 'min_loan_amount']
# Merge with the clients dataframe
stats = clients.merge(stats, left_on = 'client_id', right_index=True, how = 'left')
stats.head(10)
These operations are not difficult by themselves, but if we have hundreds of variables spread across dozens of tables, doing this by hand is not feasible. Ideally, we want a solution that can automatically perform transformations and aggregations across multiple tables and combine the resulting data into a single table. Pandas is a great resource, but there is only so much data manipulation we want to do by hand! (For more on manual feature engineering, see the excellent Python Data Science Handbook.)
Featuretools
Fortunately, featuretools is exactly the solution we are looking for. This open-source Python library automatically creates many features from a set of related tables. Featuretools is based on a method known as "Deep Feature Synthesis", which sounds a lot more imposing than it actually is (the name comes from stacking multiple features, not from using deep learning!).
Deep Feature Synthesis stacks multiple transformation and aggregation operations (called feature primitives in the featuretools vocabulary) to create features from data spread across many tables. Like most ideas in machine learning, it is a complex method built on simple concepts. By studying one building block at a time, we can form a good understanding of this powerful method.
First, let's look at our example data. We have already seen part of the dataset above; the complete set of tables is as follows:
- clients: basic information about clients at a credit union. Each client has exactly one row in this dataframe.
- loans: loans made to the clients. Each loan has its own row in this dataframe, but a client may have multiple loans.
- payments: payments made on the loans. Each payment has exactly one row, but each loan has multiple payments.
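If we have a machine learning task, such as predicting whether a client will repay a future loan, we will want to combine all the information about clients into a single table. The tables are related (through the client_id and loan_id variables), and we could use a series of transformations and aggregations to do this by hand. However, we will shortly see that featuretools can automate the process.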
Entities and EntitySets
The first two concepts in featuretools are entities and entitysets. An entity is simply a table (or a DataFrame, if you think in Pandas terms). An EntitySet is a collection of tables and the relationships between them. Think of an entityset as just another Python data structure with its own methods and attributes.
We can create an empty entityset in featuretools as follows:
import featuretools as ft
# Create new entityset
es = ft.EntitySet(id = 'clients')
Next, we need to add entities. Each entity must have an index, which is a column in which every element is unique. That is, each value in the index may appear in the table only once. The index of the clients dataframe is client_id, because each client has only one row in this dataframe. We add an entity with an existing index to an entityset using the following syntax:
# Create an entity from the client dataframe
# This dataframe already has an index and a time index
es = es.entity_from_dataframe(entity_id = 'clients', dataframe = clients,
index = 'client_id', time_index = 'joined')
The loans dataframe also has a unique index, loan_id, and the syntax for adding it to the entityset is the same as for clients. The payments dataframe, however, has no unique index. When we add this entity to the entityset, we need to pass the parameter make_index = True and specify the name of the index. In addition, although featuretools automatically infers the data type of each column in an entity, we can override this by passing a dictionary of column types to the variable_types parameter.
# Create an entity from the payments dataframe
# This does not yet have a unique index
es = es.entity_from_dataframe(entity_id = 'payments',
dataframe = payments,
variable_types = {'missed': ft.variable_types.Categorical},
make_index = True,
index = 'payment_id',
time_index = 'payment_date')
In this dataframe, missed is an integer, but it is not a numeric variable because it can take only two discrete values, so we tell featuretools to treat it as a categorical variable. After adding the dataframes to the entityset, we can inspect any of them.
The column types have been inferred correctly, with the modification we specified. Next, we need to specify how the tables in the entityset are related.
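The index and type decisions above can be sanity-checked in plain pandas before anything is handed to featuretools. The sketch below uses invented payments rows and is only a rough pandas analogue of make_index = True and the categorical override, not a description of what featuretools does internally:

```python
import pandas as pd

# Hypothetical payments rows (invented for illustration); loan_id repeats
payments = pd.DataFrame({
    "loan_id": [10, 10, 11, 11, 11],
    "payment_amount": [200.0, 200.0, 90.0, 110.0, 90.0],
    "missed": [0, 1, 0, 0, 1],
})

# No existing column is unique per row, so none can serve as the index
has_unique_column = any(payments[col].is_unique for col in payments.columns)

# Rough analogue of make_index=True: add a surrogate key, one value per row
payments.insert(0, "payment_id", list(range(len(payments))))

# Treat the 0/1 flag as categorical rather than numeric
payments["missed"] = payments["missed"].astype("category")
```

Here has_unique_column is False, which is the situation in which a generated index such as payment_id is needed.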
Table Relationships
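The best way to think about a relationship between two tables is the analogy of parent and child. This is a one-to-many relationship: each parent can have multiple children. In terms of tables, the parent table has one row for every parent, while the child table may contain multiple rows that correspond to multiple children of the same parent.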
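For example, in our dataset the clients dataframe is the parent of the loans dataframe. Each client has only one row in clients but may have multiple rows in loans. Likewise, loans is the parent of payments, because each loan has multiple payments. Parents are linked to their children by a shared variable. When we perform an aggregation, we group the child table by the parent variable and calculate statistics across the children of each parent. To formalize a relationship in featuretools, we only need to specify the variable that links the two tables.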
The clients and loans tables are linked by the client_id variable, and loans and payments are linked by loan_id. The syntax for creating a relationship and adding it to the entityset is shown below:
# Relationship between clients and previous loans
r_client_previous = ft.Relationship(es['clients']['client_id'],
es['loans']['client_id'])
# Add the relationship to the entity set
es = es.add_relationship(r_client_previous)
# Relationship between previous loans and previous payments
r_payments = ft.Relationship(es['loans']['loan_id'],
es['payments']['loan_id'])
# Add the relationship to the entity set
es = es.add_relationship(r_payments)
es
The entityset now contains three entities (tables) and the relationships that link them together. With the entities added and the relationships formalized, our entityset is complete and we are ready to create features.
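The parent and child requirement described in this section can also be checked directly in pandas. This is a small sketch with invented rows; it assumes only the client_id and loan_id column names used in the article:

```python
import pandas as pd

# Hypothetical parent and child tables (values invented for illustration)
clients = pd.DataFrame({"client_id": [1, 2, 3]})
loans = pd.DataFrame({
    "loan_id": [10, 11, 12, 13],
    "client_id": [1, 1, 2, 3],
    "loan_amount": [5000, 1200, 8000, 300],
})

# The parent key must be unique; child rows may repeat it
assert clients["client_id"].is_unique

# Every child row should point at an existing parent
orphans = loans[~loans["client_id"].isin(clients["client_id"])]

# validate="one_to_many" makes pandas raise a MergeError if the parent key repeats
joined = clients.merge(loans, on="client_id", validate="one_to_many")
loans_per_client = joined.groupby("client_id")["loan_id"].count()
```

If orphans is non-empty or the merge raises, the two tables do not form a clean one-to-many relationship on that variable.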
Feature Primitives
Before we can get to deep feature synthesis, we need to understand feature primitives. We already know what these are; we have just been calling them by different names! They are simply the basic operations we use to form new features:
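- Aggregations: operations performed across a parent-to-child (one-to-many) relationship that group by the parent and calculate statistics for the children. An example is grouping the loans table by client_id and finding the maximum loan amount for each client.
- Transformations: operations performed on one or more columns of a single table. Examples include taking the difference between two columns in the same table or taking the absolute value of a column.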
In featuretools, new features are created from these primitives, either on their own or by stacking several of them. Featuretools provides many built-in primitives, and we can also define custom ones.
These primitives can be used by themselves or combined to create features. To make features with specified primitives, we use the ft.dfs function (short for deep feature synthesis). We pass in the entityset, the target_entity (the table to which we want to add the features), the selected trans_primitives (transformations), and the agg_primitives (aggregations):
# Create new features using specified primitives
features, feature_names = ft.dfs(entityset = es, target_entity = 'clients',
agg_primitives = ['mean', 'max', 'percent_true', 'last'],
trans_primitives = ['years', 'month', 'subtract', 'divide'])
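The result is a dataframe of new features for each client (because we made clients the target_entity). For example, we have the month in which each client joined, which is a transformation primitive. We also have a number of aggregation primitives, such as the average payment amount for each client.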
Even though we specified only a few primitives, featuretools created many new features by combining and stacking them.
The complete dataframe contains 793 new features!
Deep Feature Synthesis
We now have all the pieces in place to understand deep feature synthesis (dfs). In fact, we already performed dfs in the previous function call! A deep feature is simply a feature made by stacking multiple primitives, and dfs is the name of the process that creates these features. The depth of a deep feature is the number of primitives required to make it.
For example, the MEAN(payments.payment_amount) column is a deep feature with a depth of 1 because it was created using a single aggregation. A feature with a depth of 2 is LAST(loans.MEAN(payments.payment_amount)). It is made by stacking two aggregations: LAST (most recent) on top of MEAN. It represents the average payment size of the most recent loan for each client.
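To make the stacking concrete, here is a hand-written pandas sketch of that depth-2 feature. The rows are invented, and the loan_start column used to decide which loan is "most recent" is a hypothetical name assumed for this illustration; featuretools itself orders rows by the entity's time index.

```python
import pandas as pd

# Hypothetical loans and payments tables (values invented for illustration)
loans = pd.DataFrame({
    "loan_id": [10, 11, 12],
    "client_id": [1, 1, 2],
    "loan_start": pd.to_datetime(["2014-01-01", "2016-06-01", "2015-03-01"]),
})
payments = pd.DataFrame({
    "loan_id": [10, 10, 11, 11, 12],
    "payment_amount": [100.0, 300.0, 50.0, 70.0, 400.0],
})

# Depth 1: MEAN of payment_amount for each loan
mean_payment = (payments.groupby("loan_id", as_index=False)["payment_amount"]
                .mean()
                .rename(columns={"payment_amount": "mean_payment_amount"}))
loans = loans.merge(mean_payment, on="loan_id", how="left")

# Depth 2: LAST (latest loan by loan_start) of that per-loan mean, per client
last_mean_payment = (loans.sort_values("loan_start")
                     .groupby("client_id")["mean_payment_amount"]
                     .last())
```

Each additional level of depth adds one more group-and-aggregate (or transform) step on top of the previous result, which is why deeper features quickly become harder to read.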
We can stack features to any depth we want, but in practice I have never gone beyond a depth of 2. Past that point the features become difficult to interpret, but I encourage anyone interested to try going deeper.
We do not have to specify the primitives manually; instead, we can let featuretools choose features for us automatically. To do this, we use the same ft.dfs function call but do not pass in any primitives:
# Perform deep feature synthesis without specifying primitives
features, feature_names = ft.dfs(entityset=es, target_entity='clients',
max_depth = 2)
features.head()
Featuretools has built many new features for us. Although this process creates new features automatically, it does not replace the data scientist, because we still have to figure out what to do with all of these features. For example, if our goal is to predict whether a client will repay a loan, we could look for the features most correlated with that outcome. Moreover, if we have domain knowledge, we can use it to choose specific feature primitives or to seed deep feature synthesis with candidate features.
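One simple way to look for features related to an outcome is to rank them by correlation with the target. The following is a sketch under stated assumptions: the feature matrix, the feature names, and the repaid label are all invented for illustration, and Pearson correlation is only one of several reasonable screening measures.

```python
import pandas as pd

# Hypothetical feature matrix and binary target (values invented for illustration)
features = pd.DataFrame({
    "MEAN(loans.loan_amount)": [1.0, 2.0, 3.0, 4.0, 5.0],
    "MAX(payments.payment_amount)": [5.0, 3.0, 4.0, 1.0, 2.0],
    "income": [20.0, 20.0, 70.0, 70.0, 20.0],
}, index=pd.Index([1, 2, 3, 4, 5], name="client_id"))
repaid = pd.Series([0, 0, 1, 1, 1], index=features.index, name="repaid")

# Pearson correlation of every feature column with the target
correlations = features.corrwith(repaid)

# Rank by absolute correlation, strongest first
ranked = correlations.abs().sort_values(ascending=False)
```

A ranking like this is a starting point for inspection rather than a selection rule; correlated features can still be redundant with each other.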
Next Steps
Automated feature engineering has solved one problem but created another: too many features. Although it is difficult to say which of these features will be important before fitting a model, most likely not all of them will be relevant to the task we want to train the model on. Moreover, having too many features can degrade model performance, because the less useful features drown out the important ones.
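The problem of too many features is known as the curse of dimensionality. As the number of features (the dimension of the data) increases, it becomes harder for a model to learn the mapping between features and targets. In fact, the amount of data needed for the model to perform well scales exponentially with the number of features.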
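A back-of-the-envelope way to see the exponential growth is to split each feature into a fixed number of intervals and count the resulting cells. This grid-coverage argument is a deliberately simplified illustration (the bin and per-cell counts are arbitrary assumptions), not a statement about any particular model:

```python
def grid_cells(n_features: int, bins: int = 10) -> int:
    """Number of cells when each feature is split into `bins` intervals."""
    return bins ** n_features


def samples_needed(n_features: int, bins: int = 10, per_cell: int = 5) -> int:
    """Samples required to average `per_cell` observations in every cell."""
    return per_cell * grid_cells(n_features, bins)


# Required sample count for a few feature counts
for d in (1, 2, 3, 6):
    print(d, samples_needed(d))
```

Under these assumptions, going from 2 to 6 features raises the requirement from 500 to 5,000,000 samples.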
The curse of dimensionality is combated with feature reduction (also known as feature selection): the process of removing irrelevant features. It can take many forms: Principal Component Analysis (PCA), SelectKBest, using feature importances from a model, or auto-encoding with deep neural networks. However, feature reduction is a separate topic for another article. For now, we know that we can use featuretools to create many features from many tables with minimal effort!
Conclusions
Like many topics in machine learning, automated feature engineering with featuretools is a complicated concept built on simple ideas. Using the concepts of entitysets, entities, and relationships, featuretools can perform deep feature synthesis to create new features. Deep feature synthesis, in turn, stacks feature primitives (aggregations, which act across a one-to-many relationship between tables, and transformations, which are functions applied to one or more columns of a single table) to build new features from multiple tables.