Want to tap the power behind search rankings, product recommendations, social bookmarking, and online matchmaking? This fascinating book demonstrates how you can build Web 2.0 applications to mine the enormous amount of data created by people on the Internet. With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.

Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains:

* Collaborative filtering techniques that enable online retailers to recommend products or media
* Methods of clustering to detect groups of similar items in a large dataset
* Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm
* Optimization algorithms that search millions of possible solutions to a problem and choose the best one
* Bayesian filtering, used in spam filters for classifying documents based on word types and other features
* Using decision trees not only to make predictions, but to model the way decisions are made
* Predicting numerical values rather than classifications to build price models
* Support vector machines to match people in online dating sites
* Non-negative matrix factorization to find the independent features in a dataset
* Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game

Each chapter includes exercises for extending the algorithms to make them more powerful.
Go beyond simple database-backed applications and put the wealth of Internet data to work for you.

"Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details." -- Dan Russell, Google

"Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths." -- Tim Wolters, CTO, Collective Intellect
Toby Segaran works as a Data Magnate at Metaweb Technologies. Prior to working at Metaweb, he started a biotech software company called Incellico, which was later acquired by Genstruct. His book, "Programming Collective Intelligence," has been the best-selling AI book on Amazon for several months. He is the recipient of a National Interest Waiver for "People of Exceptional Abilit...
Next, get a list of random people to make up the dataset. Fortunately, Hot or Not provides an API call that returns a list of people with specified criteria. In this example, the only criteria will be that the people have “meet me” profiles, since only from these profiles can you get other information like location and interests. Add this function to hotornot.py:
-- Quoted from page 162
What Does This Have to Do with the Articles Matrix?

So far, what you have is a matrix of articles with word counts. The goal is to factorize this matrix, which means finding two smaller matrices that can be multiplied together to reconstruct this one. The two smaller matrices are:

The features matrix
This matrix has a row for each feature and a column for each word. The values indicate how important a word is to a feature. Each feature should represent a theme that emerged from a set of articles, so you might expect an article about a new TV show to have a high weight for the word “television.”

The weights matrix
This matrix maps the features to the articles matrix. Each row is an article and each column is a feature. The values state how much each feature applies to each articl...
-- Quoted from page 234
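
The factorization described in the excerpt above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the book's own listing: the function name `factorize`, the toy `articles` matrix, and the choice of multiplicative update rules are all assumptions made here, since the quoted passage does not say how the two matrices are found.

```python
import numpy as np


def factorize(v, num_features=2, iterations=500, seed=0):
    """Approximate v (articles x words) as weights @ features.

    Hypothetical helper: uses multiplicative updates, one common way
    to keep every entry of both factors non-negative.
    """
    rng = np.random.default_rng(seed)
    n_articles, n_words = v.shape
    # weights: one row per article, one column per feature
    weights = rng.random((n_articles, num_features))
    # features: one row per feature, one column per word
    features = rng.random((num_features, n_words))
    eps = 1e-9  # guards against division by zero
    for _ in range(iterations):
        # Update the features matrix, holding the weights fixed
        features *= (weights.T @ v) / (weights.T @ weights @ features + eps)
        # Update the weights matrix, holding the features fixed
        weights *= (v @ features.T) / (weights @ features @ features.T + eps)
    return weights, features


# Toy word-count matrix (made-up data): 4 articles x 4 words,
# built from two underlying themes.
articles = np.array([
    [2, 0, 1, 0],
    [4, 0, 2, 0],
    [0, 3, 0, 1],
    [0, 6, 0, 2],
], dtype=float)

weights, features = factorize(articles)
reconstruction = weights @ features  # should be close to `articles`
print(np.round(reconstruction, 2))
```

Each row of `features` plays the role of a theme expressed as word importances, and each row of `weights` says how strongly each theme applies to one article, matching the roles described in the excerpt.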