Text Processing in Python describes techniques for manipulating text using the Python programming language. At the broadest level, text processing simply means taking textual information and doing something with it: restructuring or reformatting it, extracting smaller bits of information from it, or performing calculations that depend on it. Text processing is arguably what most programmers spend most of their time doing, and as the amount of data everywhere continues to increase, it becomes more and more of a challenge. Because Python is clear, expressive, and object-oriented, it is a perfect language for text processing, even better than Perl. This book is not a tutorial on Python. It has two other goals: helping the programmer get the job done pragmatically and efficiently, and giving the reader an understanding, both theoretical and conceptual, of why what works works and what doesn't work doesn't. Mertz provides practical pointers and tips that emphasize efficient, flexible, and maintainable approaches to the text processing tasks that working programmers face daily.
From the Back Cover:
Text Processing in Python is an example-driven, hands-on tutorial that carefully teaches programmers how to accomplish numerous text processing tasks using the Python language. Filled with concrete examples, this book provides efficient and effective solutions to specific text processing problems and practical strategies for dealing with all types of text processing challenges.
Text Processing in Python begins with an introduction to text processing and contains a quick Python tutorial to get you up to speed. It then delves into essential text processing subject areas, including string operations, regular expressions, parsers and state machines, and Internet tools and techniques. Appendixes cover such important topics as data compression and Unicode. A comprehensive index and plentiful cross-referencing offer easy access to available information. In addition, exercises throughout the book provide readers with further opportunity to hone their skills either on their own or in the classroom. A companion Web site (http://gnosis.cx/TPiP) contains source code and examples from the book.
Here is some of what you will find in this book:
* When do I use formal parsers to process structured and semi-structured data? Page 257
* How do I work with full text indexing? Page 199
* What patterns in text can be expressed using regular expressions? Page 204
* How do I find a URL or an email address in text? Page 228
* How do I process a report with a concrete state machine? Page 274
* How do I parse, create, and manipulate internet formats? Page 345
* How do I handle lossless and lossy compression? Page 454
* How do I find codepoints in Unicode? Page 465
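To give a flavor of two of the questions above (finding a URL or an email address in text, and finding Unicode codepoints), here is a minimal sketch using only the Python standard library. It is not taken from the book: the patterns, the helper names `find_urls`, `find_emails`, and `codepoints`, and the address `someone@example.com` are assumptions made for illustration, and the regular expressions are deliberately simplified.

```python
import re
import unicodedata

# Deliberately simple patterns (assumptions for illustration only); real
# URLs and email addresses have many more edge cases than these cover.
URL_RE = re.compile(r"https?://[^\s<>()]+")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_urls(text):
    """Return every substring that looks like an http(s) URL."""
    return URL_RE.findall(text)


def find_emails(text):
    """Return every substring that looks like an email address."""
    return EMAIL_RE.findall(text)


def codepoints(text):
    """Return (character, 'U+XXXX', Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The email address below is hypothetical; the URL is the companion site.
sample = "Source code is at http://gnosis.cx/TPiP (write to someone@example.com)."
print(find_urls(sample))
print(find_emails(sample))
print(codepoints("\u00e9"))
```

Patterns this loose will miss or over-match some real-world inputs, so treat them as a starting point rather than a validator.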