Python Study Notes


Jeanhwea Hoope
August 12, 2013

Contents

1 Preface
2 Personal statement
3 Start Point
    3.1 json format
    3.2 Get HTML with urllib2
    3.3 Send email with smtplib
    3.4 sgmllib can parse HTML
4 Resources on the Internet
    4.1 Using Googlemaps' API
5 About database
    5.1 Connect to MySQL
6 A few words

1 Preface

During the summer after my sophomore year I had some free time and wanted to build something; otherwise I would barely have felt that I existed. I had been reading about Python just before the break, so I took the chance to learn it. I have not written documentation in a long time, but not writing is not an option: there is too much to keep in my head. So I am jotting down a few notes on the more fun side of Python. Rereading them later should pay off.

2 Personal statement

Notation used in this document:

    Variables:     variables
    Command line:  CMD>>> python filename

3 Start Point

This is a starting point, but a fairly high one. I have not recorded basic syntax, since anyone who has learned C already has that covered. The other sides of Python are what I show off here. All scripts in these notes are written for Python 2.

3.1 json format

Put simply, json is a text format, and Python's built-in json module handles it well. Here is an example. Suppose we have a file named log.json with the following contents, one JSON object per line:

    {"price":255.84, "time":"23:59:58", "date":"2013-08-05", "net":-779.94}
    {"price":255.84, "time":"00:00:00", "date":"2013-08-06", "net":-779.94}
    {"price":255.82, "time":"00:00:09", "date":"2013-08-06", "net":-780.78}
    ...
    {"price":255.80, "time":"00:01:59", "date":"2013-08-06", "net":-781.62}
    {"price":255.79, "time":"00:02:11", "date":"2013-08-06", "net":-782.04}
    {"price":255.79, "time":"00:02:22", "date":"2013-08-06", "net":-782.04}

We can extract information from this file. The program below parses each line and prints a formatted record for every entry whose price is non-zero.

fmt.py

    import json

    filename = 'log.json'
    row_id = 1  # running record number (avoids shadowing the built-in id)
    with open(filename, 'r') as f:
        for line in f:
            # each line holds one JSON object
            s = json.loads(line)
            if s['price'] != 0:
                print '%d, %s, "%s", "%s"' % (row_id, s['price'], s['time'], s['date'])
                row_id += 1

The output is:

    1, 255.84, "23:59:58", "2013-08-05"
    2, 255.84, "00:00:00", "2013-08-06"
    3, 255.82, "00:00:09", "2013-08-06"
    ...
    10, 255.8, "00:01:59", "2013-08-06"
    11, 255.79, "00:02:11", "2013-08-06"
    12, 255.79, "00:02:22", "2013-08-06"

3.2 Get HTML with urllib2

urllib2 is Python's module for fetching URLs (Uniform Resource Locators). It offers a very simple interface in the form of the urlopen function, which makes it easy to fetch HTML resources from the web. The example below fetches the Google home page; the output is the page's HTML as plain text.

urlopen

    from urllib2 import urlopen, URLError

    url = 'http://www.google.com'
    try:
        page = urlopen(url)
        html = page.read()
        # show what we get
        print html
    except URLError as e:
        # report the failure instead of silently ignoring it
        print 'request failed:', e

With urlopen you can crawl web resources. Fun, isn't it?

3.3 Send email with smtplib

smtplib is Python's client implementation of SMTP (Simple Mail Transfer Protocol). With it we can easily send email. The code below sends a message to someone's mailbox: it first wraps the message with MIME, then sends it over SMTP using smtplib. The listing is titled email.py, but save it under a different name when you run it: a script called email.py shadows the standard email package and makes the imports fail.

email.py

    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    from email import utils  # needed for formatdate and make_msgid
    import smtplib

    # some information
    frm = 'xxxxxxx@126.com'
    to = 'xxxxxxxx@vip.qq.com'
    sub = ''      # subject
    message = ''  # give a message

    # construct an email
    msg = MIMEMultipart()
    msg['To'] = to
    msg['From'] = '=.=' + ' ' + frm
    msg['Subject'] = sub
    msg['Date'] = utils.formatdate(localtime=True)
    msg['Message-ID'] = utils.make_msgid()
    body = MIMEText(message, _subtype='plain', _charset='utf-8')
    msg.attach(body)

    # send it
    smtp = smtplib.SMTP()
    smtp.connect('smtp.126.com')
    smtp.login(frm, 'password')
    smtp.sendmail(frm, to, msg.as_string())
    smtp.quit()

Now you can send email to your heart's content. As for how many messages exactly? Heh...

3.4 sgmllib can parse HTML

sgmllib contains one important class: SGMLParser. SGMLParser breaks HTML into useful pieces, such as start tags and end tags. Once it has broken out a piece of data, it calls one of its own methods depending on what it found. To use the parser, you subclass SGMLParser and override these methods. The code below is one such implementation: LinksParser inherits from SGMLParser and overrides the handler for link tags to collect the resources we want.

xsgllib.py

    import sgmllib, urllib, urlparse

    class LinksParser(sgmllib.SGMLParser):
        def __init__(self):
            sgmllib.SGMLParser.__init__(self)
            self.seen = set()  # href values already printed

        # called by SGMLParser for every <link> tag
        def do_link(self, attributes):
            for name, value in attributes:
                if name == 'href' and value not in self.seen:
                    self.seen.add(value)
                    pieces = urlparse.urlparse(value)
                    # keep only absolute http URLs
                    if pieces[0] != 'http':
                        return
                    print urlparse.urlunparse(pieces)
                    return

    p = LinksParser()
    f = urllib.urlopen('http://www.python.org/index.html')
    BUFSIZE = 8192
    # feed the page to the parser in chunks
    while True:
        data = f.read(BUFSIZE)
        if not data:
            break
        p.feed(data)
    p.close()

The script uses sgmllib to parse the HTML document and pulls the "href" attribute out of each link tag. The result is:

    http://www.python.org/channews.rdf
    http://aspn.activestate.com/ASPN/Cookbook/Python/index_rss
    http://python-groups.blogspot.com/feeds/posts/default
    http://www.showmedo.com/latestVideoFeed/rss2.0?tag=python
    http://www.awaretek.com/python/index.xml
    http://feeds.feedburner.com/PythonSoftwareFoundationNews
    http://www.python.org/dev/peps/peps.rss
    http://www.python.org/community/jobs/jobs.rss
    http://www.reddit.com/r/Python/.rss
    http://feeds.feedburner.com/PythonInsider

These links can then be used to download the resources.

4 Resources on the Internet

There are also plenty of web resources that Python can draw on: many sites provide APIs that make a developer's work easier.

4.1 Using Googlemaps' API

Below is a program provided by Google that can be used for geographic lookups. For more information see:
https://developers.google.com/places/documentation/

googlemaps.py

    import simplejson, urllib

    GEOCODE_BASE_URL = 'http://maps.googleapis.com/maps/api/geocode/json'

    def geocode(address, sensor, **geo_args):
        # merge the required parameters into any extra query arguments
        geo_args.update({
            'address': address,
            'sensor': sensor
        })

        url = GEOCODE_BASE_URL + '?' + urllib.urlencode(geo_args)
        result = simplejson.load(urllib.urlopen(url))

        # print the formatted address of every match
        print simplejson.dumps([s['formatted_address']
                                for s in result['results']], indent=2)

    if __name__ == '__main__':
        geocode(address="***+***+***", sensor="false")

If you set address to chongqing, you get the output below, which makes it easy to turn a place name into a full address that includes the country. You can also join several keywords with + to search on all of them.

    [
      "Chongqing, China"
    ]

5 About database

This part records material related to databases.

5.1 Connect to MySQL

To use a database, you first have to connect to it. The example below shows how to connect to MySQL. The directory structure of the MySQL database is shown in Figure 1. For more examples see:
http://dev.mysql.com/doc/refman/5.7/en/myconnpy_example_connecting.html

Figure 1: Database directory structure

test_connect.py

    import mysql.connector

    config = {
        'user': '******',
        'password': '******',
        'database': 'log',
        'host': 'localhost',
        'port': 3306
    }

    cnx = mysql.connector.connect(**config)
    cursor = cnx.cursor()
    # the three rows with the highest price
    query = ("SELECT * FROM inf ORDER BY price DESC LIMIT 3")
    cursor.execute(query)
    for item in cursor:
        # print the columns of one row on a single line
        for t in item:
            print t,
        print
    cursor.close()
    cnx.close()

The query returns:

    20 258.86 "23:26:50" "2013-08-01"
    22 258.84 "23:27:13" "2013-08-01"
    21 258.84 "23:27:02" "2013-08-01"

6 A few words

I really do not have much to say. I have always been fond of LaTeX, but Chinese encoding has been a real headache, and I have no idea how CTeX manages to handle Chinese encodings. Still, I am content to be enjoying a free service.
    (August 10, 2013)

What is going on with Chinese tables of contents in LaTeX? I could not get them to work no matter what I tried. Advice online says to add \newline at the end of the document; there is no rationale for it, yet it supposedly cures the problem of Chinese text in the table of contents. I tried it and it did not seem to work at all. In the end I gave up and used English headings.
    (August 11, 2013)
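A note for readers on Python 3, relating to section 3.4: sgmllib was removed from the standard library in Python 3, so the LinksParser listing only runs on Python 2. The following is a minimal sketch of the same idea (subclass a parser and override its tag handler) using the standard html.parser module. It is not the author's code: the class name LinkCollector and the sample HTML string are illustrative assumptions, and the sample replaces the network fetch so that the sketch runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect unique http href values from <link> tags (hypothetical name)."""

    def __init__(self):
        super().__init__()
        self.seen = set()  # every href value encountered so far
        self.links = []    # http hrefs, in first-seen order

    def handle_starttag(self, tag, attrs):
        # HTMLParser calls this for every opening tag; attrs is a list
        # of (name, value) pairs, much like sgmllib's attributes.
        if tag != 'link':
            return
        for name, value in attrs:
            if name == 'href' and value and value not in self.seen:
                self.seen.add(value)
                # keep only absolute http URLs, as in the original listing
                if urlparse(value).scheme == 'http':
                    self.links.append(value)


# Illustrative input; a real page could be fetched with urllib.request.
SAMPLE_HTML = """
<html><head>
<link rel="alternate" href="http://www.python.org/channews.rdf">
<link rel="stylesheet" href="/styles/main.css">
<link rel="alternate" href="http://www.python.org/channews.rdf">
</head><body><a href="http://example.com/">not a link tag</a></body></html>
"""

parser = LinkCollector()
parser.feed(SAMPLE_HTML)
parser.close()
for link in parser.links:
    print(link)
```

Storing the links in a list instead of printing them inside the handler is a design choice of this sketch; it lets the caller decide what to do with the results, for example downloading them.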
