Big data is better data

0:11 America's favorite pie is?

0:15 Audience: Apple.

Kenneth Cukier: Apple. Of course it is. How do we know it? Because of data. You look at supermarket sales. You look at supermarket sales of 30-centimeter pies that are frozen, and apple wins, no contest. The majority of the sales are apple. But then supermarkets started selling smaller, 11-centimeter pies, and suddenly, apple fell to fourth or fifth place. Why? What happened? Okay, think about it. When you buy a 30-centimeter pie, the whole family has to agree, and apple is everyone's second favorite. (Laughter) But when you buy an individual 11-centimeter pie, you can buy the one that you want. You can get your first choice. You have more data. You can see something that you couldn't see when you only had smaller amounts of it.

1:24 Now, the point here is that more data doesn't just let us see more, more of the same thing we were looking at. More data allows us to see new. It allows us to see better. It allows us to see different. In this case, it allows us to see what America's favorite pie is: not apple.

1:49 Now, you probably all have heard the term big data. In fact, you're probably sick of hearing the term big data. It is true that there is a lot of hype around the term, and that is very unfortunate, because big data is an extremely important tool by which society is going to advance. In the past, we used to look at small data and think about what it would mean to try to understand the world, and now we have a lot more of it, more than we ever could before. What we find is that when we have a large body of data, we can fundamentally do things that we couldn't do when we only had smaller amounts. Big data is important, and big data is new, and when you think about it, the only way this planet is going to deal with its global challenges (to feed people, supply them with medical care, supply them with energy, electricity, and to make sure they're not burnt to a crisp because of global warming) is because of the effective use of data.

2:50 So what is new about big data? What is the big deal? Well, to answer that question, let's think about what information looked like, physically looked like in the past. In 1908, on the island of Crete, archaeologists discovered a clay disc. They dated it from 2000 B.C., so it's 4,000 years old. Now, there are inscriptions on this disc, but we actually don't know what they mean. It's a complete mystery, but the point is that this is what information used to look like 4,000 years ago. This is how society stored and transmitted information.

3:30 Now, society hasn't advanced all that much. We still store information on discs, but now we can store a lot more information, more than ever before. Searching it is easier. Copying it is easier. Sharing it is easier. Processing it is easier. And what we can do is we can reuse this information for uses that we never even imagined when we first collected the data. In this respect, the data has gone from a stock to a flow, from something that is stationary and static to something that is fluid and dynamic. There is, if you will, a liquidity to information. The disc that was discovered off of Crete that's 4,000 years old, is heavy, it doesn't store a lot of information, and that information is unchangeable. By contrast, all of the files that Edward Snowden took from the National Security Agency in the United States fit on a memory stick the size of a fingernail, and they can be shared at the speed of light. More data. More.

4:50 Now, one reason why we have so much data in the world today is we are collecting things that we've always collected information on, but another reason why is we're taking things that have always been informational but have never been rendered into a data format and we are putting it into data. Think, for example, of the question of location. Take, for example, Martin Luther. If we wanted to know in the 1500s where Martin Luther was, we would have to follow him at all times, maybe with a feathery quill and an inkwell, and record it, but now think about what it looks like today. You know that somewhere, probably in a telecommunications carrier's database, there is a spreadsheet or at least a database entry that records your information of where you've been at all times. If you have a cell phone, and that cell phone has GPS, but even if it doesn't have GPS, it can record your information. In this respect, location has been datafied.

5:47 Now think, for example, of the issue of posture, the way that you are all sitting right now, the way that you sit, the way that you sit, the way that you sit. It's all different, and it's a function of your leg length and your back and the contours of your back, and if I were to put sensors, maybe 100 sensors into all of your chairs right now, I could create an index that's fairly unique to you, sort of like a fingerprint, but it's not your finger.

6:14 So what could we do with this? Researchers in Tokyo are using it as a potential anti-theft device in cars. The idea is that the carjacker sits behind the wheel, tries to stream off, but the car recognizes that a non-approved driver is behind the wheel, and maybe the engine just stops, unless you type in a password into the dashboard to say, "Hey, I have authorization to drive." Great.

6:41 What if every single car in Europe had this technology in it? What could we do then? Maybe, if we aggregated the data, maybe we could identify telltale signs that best predict that a car accident is going to take place in the next five seconds. And then what we will have datafied is driver fatigue, and the service would be when the car senses that the person slumps into that position, automatically knows, hey, set an internal alarm that would vibrate the steering wheel, honk inside to say, "Hey, wake up, pay more attention to the road." These are the sorts of things we can do when we datafy more aspects of our lives.

7:28 So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most impressive areas where this concept is taking place is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to simply throw data at the problem and tell the computer to figure it out for itself. And it will help you understand it by seeing its origins.

In the 1950s, a computer scientist at IBM named Arthur Samuel liked to play checkers, so he wrote a computer program so he could play against the computer. He played. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the background, and all it did was score the probability that a given board configuration would likely lead to a winning board versus a losing board after every move. He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur Samuel leaves the computer to play itself. It plays itself. It collects more data. It collects more data. It increases the accuracy of its prediction. And then Arthur Samuel goes back to the computer and he plays it, and he loses, and he plays it, and he loses, and he plays it, and he loses, and Arthur Samuel has created a machine that surpasses his ability in a task that he taught it.

9:29 And this idea of machine learning is going everywhere. How do you think we have self-driving cars? Are we any better off as a society enshrining all the rules of the road into software? No. Memory is cheaper. No. Algorithms are faster. No. Processors are better. No. All of those things matter, but that's not why. It's because we changed the nature of the problem. We changed the nature of the problem from one in which we tried to overtly and explicitly explain to the computer how to drive to one in which we say, "Here's a lot of data around the vehicle. You figure it out. You figure it out that that is a traffic light, that that traffic light is red and not green, that that means that you need to stop and not go forward."

10:17 Machine learning is at the basis of many of the things that we do online: search engines, Amazon's personalization algorithm, computer translation, voice recognition systems. Researchers recently have looked at the question of biopsies, cancerous biopsies, and they've asked the computer to identify by looking at the data and survival rates to determine whether cells are actually cancerous or not, and sure enough, when you throw the data at it, through a machine-learning algorithm, the machine was able to identify the 12 telltale signs that best predict that this biopsy of the breast cancer cells are indeed cancerous. The problem: The medical literature only knew nine of them. Three of the traits were ones that people didn't need to look for, but that the machine spotted.

11:23 Now, there are dark sides to big data as well. It will improve our lives, but there are problems that we need to be conscious of, and the first one is the idea that we may be punished for predictions, that the police may use big data for their purposes, a little bit like "Minority Report." Now, it's a term called predictive policing, or algorithmic criminology, and the idea is that if we take a lot of data, for example where past crimes have been, we know where to send the patrols. That makes sense, but the problem, of course, is that it's not simply going to stop on location data, it's going to go down to the level of the individual. Why don't we use data about the person's high school transcript? Maybe we should use the fact that they're unemployed or not, their credit score, their web-surfing behavior, whether they're up late at night. Their Fitbit, when it's able to identify biochemistries, will show that they have aggressive thoughts. We may have algorithms that are likely to predict what we are about to do, and we may be held accountable before we've actually acted. Privacy was the central challenge in a small data era. In the big data age, the challenge will be safeguarding free will, moral choice, human volition, human agency.

12:53 There is another problem: Big data is going to steal our jobs. Big data and algorithms are going to challenge white collar, professional knowledge work in the 21st century in the same way that factory automation and the assembly line challenged blue collar labor in the 20th century. Think about a lab technician who is looking through a microscope at a cancer biopsy and determining whether it's cancerous or not. The person went to university. The person buys property. He or she votes. He or she is a stakeholder in society. And that person's job, as well as an entire fleet of professionals like that person, is going to find that their jobs are radically changed or actually completely eliminated.

Now, we like to think that technology creates jobs over a period of time after a short, temporary period of dislocation, and that is true for the frame of reference with which we all live, the Industrial Revolution, because that's precisely what happened. But we forget something in that analysis: There are some categories of jobs that simply get eliminated and never come back. The Industrial Revolution wasn't very good if you were a horse.

So we're going to need to be careful and take big data and adjust it for our needs, our very human needs. We have to be the master of this technology, not its servant. We are just at the outset of the big data era, and honestly, we are not very good at handling all the data that we can now collect. It's not just a problem for the National Security Agency. Businesses collect lots of data, and they misuse it too, and we need to get better at this, and this will take time. It's a little bit like the challenge that was faced by primitive man and fire. This is a tool, but this is a tool that, unless we're careful, will burn us.

14:55 Big data is going to transform how we live, how we work and how we think. It is going to help us manage our careers and lead lives of satisfaction and hope and happiness and health, but in the past, we've often looked at information technology and our eyes have only seen the T, the technology, the hardware, because that's what was physical. We now need to recast our gaze at the I, the information, which is less apparent, but in some ways a lot more important. Humanity can finally learn from the information that it can collect, as part of our timeless quest to understand the world and our place in it, and that's why big data is a big deal.

15:45 (Applause)