Want to tap the power behind search rankings, product recommendations, social bookmarking, and online matchmaking? This fascinating book demonstrates how you can build Web 2.0 applications to mine the enormous amount of data created by people on the Internet. With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.

Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains:

* Collaborative filtering techniques that enable online retailers to recommend products or media
* Methods of clustering to detect groups of similar items in a large dataset
* Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm
* Optimization algorithms that search millions of possible solutions to a problem and choose the best one
* Bayesian filtering, used in spam filters for classifying documents based on word types and other features
* Using decision trees not only to make predictions, but to model the way decisions are made
* Predicting numerical values rather than classifications to build price models
* Support vector machines to match people in online dating sites
* Non-negative matrix factorization to find the independent features in a dataset
* Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game

Each chapter includes exercises for extending the algorithms to make them more powerful.
Go beyond simple database-backed applications and put the wealth of Internet data to work for you.

"Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details." -- Dan Russell, Google

"Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths." -- Tim Wolters, CTO, Collective Intellect
Toby Segaran works as a Data Magnate at Metaweb Technologies. Before joining Metaweb, he started a biotech software company called Incellico, which was later acquired by Genstruct. His book, "Programming Collective Intelligence," has been the best-selling AI book on Amazon for several months. He is the recipient of a National Interest Waiver for "People of Exceptional Abilit...
Next, get a list of random people to make up the dataset. Fortunately, Hot or Not provides an API call that returns a list of people with specified criteria. In this example, the only criteria will be that the people have "meet me" profiles, since only from these profiles can you get other information like location and interests. Add this function to hotornot.py:
-- Quoted from page 162
What Does This Have to Do with the Articles Matrix?

So far, what you have is a matrix of articles with word counts. The goal is to factorize this matrix, which means finding two smaller matrices that can be multiplied together to reconstruct this one. The two smaller matrices are:

The features matrix
This matrix has a row for each feature and a column for each word. The values indicate how important a word is to a feature. Each feature should represent a theme that emerged from a set of articles, so you might expect an article about a new TV show to have a high weight for the word "television."

The weights matrix
This matrix maps the features to the articles matrix. Each row is an article and each column is a feature. The values state how much each feature applies to each articl...
-- Quoted from page 234
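The shapes described in this excerpt can be made concrete with a small sketch. It assumes NumPy and uses the widely known multiplicative update rule for non-negative matrix factorization; it is an illustration of the idea, not necessarily the code the book itself presents. The function name `factorize`, the toy `articles` matrix, and all parameter values are hypothetical.

```python
import numpy as np

def factorize(v, num_features, iterations=200, seed=0):
    """Approximate v (articles x words) as weights @ features.

    weights  has one row per article and one column per feature.
    features has one row per feature and one column per word.
    Uses multiplicative updates, which keep every entry non-negative.
    """
    rng = np.random.default_rng(seed)
    n_articles, n_words = v.shape
    w = rng.random((n_articles, num_features))  # weights matrix
    h = rng.random((num_features, n_words))     # features matrix
    eps = 1e-9  # guards against division by zero
    for _ in range(iterations):
        h *= (w.T @ v) / (w.T @ w @ h + eps)
        w *= (v @ h.T) / (w @ h @ h.T + eps)
    return w, h

# Toy word-count matrix (assumed data): 4 articles x 4 words.
articles = np.array([
    [3.0, 0.0, 1.0, 0.0],
    [2.0, 0.0, 2.0, 0.0],
    [0.0, 4.0, 0.0, 1.0],
    [0.0, 3.0, 0.0, 2.0],
])

weights, features = factorize(articles, num_features=2)
reconstruction = weights @ features  # same shape as articles
error = np.linalg.norm(articles - reconstruction)
print(weights.shape, features.shape, round(float(error), 3))
```

Multiplying the two smaller matrices gives an approximation of the original word counts, so a lower `error` means the chosen number of features captures more of the articles matrix.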