This is not a systematic analysis, and it is not a table. It is a personal view, and one from a geophysicist's perspective at that. Still, I am always curious to read the Gartner MQ; they formulate some points perfectly. So here is what caught my attention from a technical, market, and philosophical standpoint.
This is not for people deeply into ML, but for people interested in what is happening in the market in general.
The DSML market itself nests logically between BI and Cloud AI developer services.
First, the quotes and terms I liked:
- «A Leader may not be the best choice»: the market leader is not necessarily what you need. Very pressing! For lack of a functional customer, they always look for the «best» solution rather than the «suitable» one.
- «Model operationalisation», abbreviated as MOPs. And pugs are hard for everyone! (A pun: «mops» is Russian for pug. The cool pug theme makes the model work.)
- The «notebook environment» is an important concept that brings together code, comments, data, and results. It is very clear and promising, and it can significantly reduce the amount of UI code.
- «Rooted in OpenSource»
- «Citizen Data Scientists»
- «Democratise»: «democratise the data», «free the data»; the long tail.
- «Exploratory Data Analysis (EDA)»
- «Reproducibility»: preserving all parameters of the environment, inputs, and outputs as fully as possible, so that an experiment, once run, can be repeated. The most important term for an experimental test environment!
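To make the reproducibility term concrete, here is a minimal sketch of what "preserving environment, inputs, and outputs" could look like. Everything in it is an assumption for illustration: the `run_experiment` and `fingerprint` helpers, the `scale` parameter, and the toy "model" are hypothetical and do not come from any of the platforms reviewed here.

```python
import hashlib
import json
import platform
import sys


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest so inputs can be compared between runs."""
    return hashlib.sha256(data).hexdigest()


def run_experiment(input_data: bytes, params: dict) -> dict:
    """Run a toy experiment and return a record sufficient to repeat it."""
    # Hypothetical stand-in for a real model: scale the mean byte value.
    output = params["scale"] * sum(input_data) / len(input_data)
    return {
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "params": params,
        "input_sha256": fingerprint(input_data),
        "output": output,
    }


# First run: keep the full record next to the result.
record = run_experiment(b"example well log", {"scale": 2.0})

# Later run: reuse the recorded parameters and check the same input is used.
rerun = run_experiment(b"example well log", record["params"])

print(json.dumps(record, indent=2))
```

The point of the design is that the record, not the analyst's memory, carries the parameters and the input fingerprint, so a rerun can be checked against the first run.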
So:
Alteryx
A cool interface, just like a toy. Scalability is, of course, a bit tight. Hence the citizen community of engineers around it, playing with the same kind of trinkets. The analytics comes all-in-one, in its own bottle. It reminded me of the Coscad spectral correlation data analysis suite, which was programmed in the 90s.
Anaconda
A community centered on Python and R experts. Open source, and correspondingly large. It turned out that my colleagues use it all the time, and I had no idea.
DataBricks
It consists of three open source projects. The Spark developers have raised an enormous amount of money since 2013. I simply have to quote the wiki directly:
«In September 2013, Databricks announced that it had raised $13.9 million from Andreessen Horowitz. The company raised an additional $33 million in 2014, $60 million in 2016, $140 million in 2017, $250 million in 2019 (February), and $400 million in 2019 (October).» Great people worked on Spark. Sadly, I don't know them!
And the projects are:
- Delta Lake: ACID on Spark, released recently (what we dreamed of with Elasticsearch). It turns it into a database: strict schema, ACID, auditing, versions...
- MLflow: tracking, packaging, management, and storage of models.
- Koalas: the Pandas DataFrame API on Spark. Pandas is the Python API for working with tables and data in general.
For anyone who suddenly does not know Spark, or has forgotten it, there is an overview here: link. I watched videos with examples from somewhat dull but thorough consultants: DataBricks for Data Science (link) and for Data Engineering (link).
In short, Databricks pulls Spark along. Anyone who wants to use Spark properly in the cloud picks DataBricks without hesitation, as intended :) Spark is the main differentiator here.
I learned that Spark Streaming is not true real time but «fake» real time, that is, micro-batching. If you need genuine real time, that is Apache Storm. Everyone also says and writes that Spark is cooler than MapReduce. That is the slogan.
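The difference between micro-batching and per-event processing is easy to show in a few lines. The sketch below is plain Python and does not use Spark or Storm; the `micro_batches` function and the sample events are hypothetical, chosen only to show how events get grouped into fixed time windows before anything is processed.

```python
from collections import defaultdict


def micro_batches(events, interval):
    """Group (timestamp, value) events into consecutive windows of `interval` seconds."""
    batches = defaultdict(list)
    for timestamp, value in events:
        # Every event that falls into the same window shares one batch index.
        batches[int(timestamp // interval)].append(value)
    # A batch is handed over only after its window closes, hence the added latency.
    return [batches[index] for index in sorted(batches)]


events = [(0.1, "a"), (0.4, "b"), (1.2, "c"), (2.7, "d"), (2.9, "e")]
batches = micro_batches(events, interval=1.0)
print(batches)
```

With a one-second interval, event "a" waits for its window to close before it is handled, whereas a per-event system would process it on arrival. That waiting time is the price of micro-batching.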
DATAIKU
A cool end-to-end thing. There is a lot of advertising. I did not understand how it differs from Alteryx.
DataRobot
Paxata, for data preparation, is a separate company that DataRobot acquired in December 2019. It raised 20 MUSD and sold. All within 7 years.
Data preparation in Paxata rather than in Excel: see link.
There are automatic lookups and join suggestions between two datasets. A great thing; to make sense of the data, I would want even more emphasis on textual information (link).
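One simple way to picture join suggestions between two datasets is to score column pairs by how many values they share. The sketch below is an assumption for illustration only: it is not Paxata's algorithm, and the `jaccard` and `suggest_joins` helpers, the threshold, and the sample tables are all hypothetical.

```python
def jaccard(left, right):
    """Share of distinct values that two columns have in common."""
    left_set, right_set = set(left), set(right)
    union = left_set | right_set
    return len(left_set & right_set) / len(union) if union else 0.0


def suggest_joins(table_a, table_b, threshold=0.5):
    """Rank column pairs whose value overlap reaches `threshold`."""
    candidates = []
    for name_a, column_a in table_a.items():
        for name_b, column_b in table_b.items():
            score = jaccard(column_a, column_b)
            if score >= threshold:
                candidates.append((name_a, name_b, score))
    # Best-overlapping pairs first.
    return sorted(candidates, key=lambda item: item[2], reverse=True)


# Two toy tables stored as column name -> list of values.
wells = {"well_id": ["W1", "W2", "W3"], "depth_m": [2100, 2450, 1980]}
production = {"well": ["W1", "W2", "W4"], "oil_t": [120, 95, 60]}

suggestions = suggest_joins(wells, production)
print(suggestions)
```

Here the only suggested pair is `well_id` and `well`, because the column names differ but the values overlap. A real tool would presumably also weigh data types, formats, and text patterns.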
The Data Catalog is an excellent catalog of «live» datasets that nobody needs.
It is also interesting how catalogs are formed in Paxata (link).
«According to analyst firm Ovum, the software is made possible through advances in predictive analytics, machine learning and the NoSQL data caching methodology.[15] The software uses semantic algorithms to understand the meaning of a data table's columns and pattern recognition algorithms to find potential duplicates in a data-set.[15][7] It also uses indexing, text pattern recognition and other technologies traditionally found in social media and search software.»
DataRobot's main product is here. Their slogan is «from Model to Enterprise Application»! I found their consulting for the oil industry in connection with the crisis, but it is very banal and uninteresting: link. I watched their videos on MOPs, or MLOps (link). It is a Frankenstein assembled from 6 or 7 acquisitions of various products.
Of course, it becomes clear that a large team of data scientists needs exactly this kind of environment for working with models; otherwise they will produce a lot of models and never deploy anything. In our oil and gas upstream reality, successfully building even one model would already be great progress!
The process itself strongly resembles working with design systems in geology and geophysics, Petrel for example. Anyone who feels like it builds and modifies models and gathers data into the model. Then they make a reference model and send it to production! There is a lot in common between, say, a geological model and an ML model.
Domino
The emphasis is on an open platform and collaboration. Business users get in for free. Their Data Lab is very similar to SharePoint (and the name smacks strongly of IBM). All experiments are linked to the original dataset. How familiar this is :) Just as in our practice: some data gets dragged into the model, then it is cleaned up and put in order inside the model, and it all lives on in the model while no trace leads back to the source data.
Domino has cool infrastructure virtualization. I assembled a machine with as many cores as I needed in a second and went off to compute. How this is done is not immediately clear. Docker everywhere. Lots of freedom! Workspaces of any of the latest versions can be connected. Experiments run in parallel, with tracking and selection of the successful ones.
Same as DataRobot: the results are published for business users in the form of applications, for especially gifted «stakeholders». The actual use of the models is monitored as well. Everything for the pugs!
I did not fully understand how complex models get to production. Some kind of API is provided to feed them data and get results.
H2O
Driverless AI is a very compact and easy-to-understand system for supervised ML. Everything in one box. The backend is not immediately clear.
The model is automatically packaged into a REST server or a Java app. That is a great idea. A lot has been done for interpretability and explainability, that is, interpreting and explaining the results of a model (what, in essence, should not be explainable? Otherwise a person could compute the same thing).
For the first time, a case study on unstructured data and NLP is examined in detail. A high-quality architecture picture. In general, I liked the pictures.
There is a large open source H2O framework that is not entirely clear to me (a set of algorithms/libraries?). It has its own visual notebook without programming, like Jupyter (link). I also read about POJO and MOJO, which are H2O models wrapped in Java. The first is straightforward, the second is optimized. H2O is the only (!) vendor for which Gartner lists text analytics and NLP as strengths, along with its explainability efforts. That is very important!
There too: high performance, optimization, and industry standards for integration with hardware and clouds.
The weakness is logical too: Driverless AI is weak and narrow compared with their own open source. Data preparation is lacking compared with the same Paxata. And they ignore industrial data: streams, graphs, geo. Well, not everything can be right.
KNIME
I liked the 6 very specific, very interesting business cases on the home page. Strong open source.
Gartner demoted them from Leaders to Visionaries. Earning money poorly is a good sign for users, given that a Leader is not always the best choice.
The keyword is the same as for H2O: augmented, which means helping the poor citizen data scientists. This is the first time in the review that someone is criticized for performance. Interesting: does that mean there is so much computing power that performance cannot be a systemic problem at all? Gartner has a separate article about this word, «Augmented», but I could not get to it.
And KNIME seems to be the first non-American in the review! (And our designers really liked their landing page. Strange people.)
MathWorks
MatLab is an old, honored comrade known to everyone! Toolboxes for all areas of life and every situation. Something very different. In fact, lots and lots and lots of math for every occasion!
Simulink is an add-on product for system design. I dug into the toolboxes for Digital Twins: I understand nothing about it, but a lot is written there. Also for the oil industry. In general, this is a fundamentally different product, from the depths of mathematics and engineering. You pick specific math toolkits. According to Gartner, their problems are the same as those of smart engineers: no collaboration, everyone digging around in their own model, no democracy, no explainability.
RapidMiner
I had come across and heard a lot about it before (along with Matlab) in the context of good open source. As usual, I dug a little into TurboPrep. I am interested in how to get clean data out of dirty data.
Once again, judging by the 2018 marketing materials and the people speaking terrible English in the feature demo, these are good people.
And they are people from Dortmund, around since 2001, with a strong background!

It is still unclear to me from the site what exactly is available in open source; I need to dig deeper. Good videos on deployment and AutoML concepts.
There is nothing special about the RapidMiner Server backend either. It is probably compact and works well on premises, out of the box. It is packaged in Docker. The shared environment exists only on RapidMiner Server. And then there is Radoop: data from Hadoop and computations from Spark in a Studio workflow.
As expected, the hot young vendors, the «striped stick sellers», pushed them down. Still, Gartner predicts their future success in the Enterprise space. There is money to be raised there. The Germans know how to do that, holy, holy :) Just don't mention SAP!!!
They do a lot for citizens! But on this page you can see Gartner saying that they struggle with sales innovation and fight not for breadth of coverage but for profitability.
That leaves SAS and Tibco, typical BI vendors to me... Both confirm my belief that proper data science grows logically
out of BI, not out of clouds and Hadoop infrastructure. From business, that is, not from IT. As in the Gazpromneft example (link), a mature DSML environment grows out of solid BI practice. But perhaps that view is tainted and biased toward MDM and other things.
SAS
There is not much to say. Only the obvious.
TIBCO
The strategy can be read off the shopping list on their page-long Wiki page. Yes, a long story, but 28!!! Charles. They bought BI Spotfire (2007) back in my techno-youth. There is also reporting from Jaspersoft (2014); three predictive analytics vendors, Insightful (S-plus) (2008), Statistica (2017), and Alpine Data (2017); event processing and streaming from Streambase System (2013); MDM from Orchestra Networks (2018); and the Snappy Data (2019) in-memory platform.
Hello, Frankie!
