Kindle clippings.txt with Python
[This article was first published on Max Humber, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Exactly a year ago I posted Kindle clippings.txt with R. Since then things have changed… I’m a Pythonista now! Consequently, I thought it would be fun to update that post and parse highlights with Python 3.6+ and pandas. Janky, but it works:
import pandas as pd

# Sample data in the layout of a Kindle "My Clippings.txt" file
txt = """Sourdough (Robin Sloan)
- Your Highlight on page 187 | Location 2853-2855 | Added on Tuesday, October 2, 2017 8:47:09 PM

The world is going to change, I think—slowly at first, then faster than anyone expects.
==========
Sapiens (Yuval Noah Harari)
- Your Highlight on page 196 | Location 2996-2997 | Added on Tuesday, October 3, 2017 8:51:09 PM

Evolution has made Homo sapiens, like other social mammals, a xenophobic creature.
==========
Life 3.0 (Max Tegmark)
- Your Highlight on page 75 | Location 1136-1137 | Added on Wednesday, October 11, 2017 6:00:15 PM

In short, computation is a pattern in the spacetime arrangement of particles
==========
"""

# Write the sample to disk so the rest of the script reads it like a real file
with open('clippings.txt', 'w', encoding='utf-8-sig') as f:
    f.write(txt)

# Read the file back, dropping any stray byte-order marks
with open('clippings.txt', 'r', encoding='utf-8-sig') as f:
    contents = f.read().replace('\ufeff', '')

# Each clipping ends with a line of ten equals signs
lines = contents.rsplit('==========')
store = {'author': [], 'title': [], 'quote': []}

for line in lines:
    try:
        # "Title (Author)" comes first, then the "- Your Highlight ..." line
        meta, quote = line.split(')\n- ', 1)
        title, author = meta.split(' (', 1)
        # A blank line separates the highlight details from the quote itself
        _, quote = quote.split('\n\n')
        store['author'].append(author.strip())
        store['title'].append(title.strip())
        store['quote'].append(quote.strip())
    except ValueError:
        # Skip anything that does not match the expected layout
        pass

# Fix the column order so the output below is the same on every pandas version
df = pd.DataFrame(store, columns=['author', 'quote', 'title'])
print(df.to_csv(index=False, encoding='utf-8-sig'))

# author,quote,title
# Robin Sloan,"The world is going to change, I think—slowly at first, then faster than anyone expects.",Sourdough
# Yuval Noah Harari,"Evolution has made Homo sapiens, like other social mammals, a xenophobic creature.",Sapiens
# Max Tegmark,"In short, computation is a pattern in the spacetime arrangement of particles",Life 3.0
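The script above throws away the "Your Highlight on page ..." line (it lands in `_`). If you ever want the page and location too, here is a rough sketch of one way to pull them out with the standard library. The names `META_RE` and `parse_meta` are made up for this example, and the pattern assumes the English layout shown in the sample data; anything that deviates from it (an entry without a page number, say) simply comes back as `None`.

```python
import re

# Hypothetical pattern for the highlight details line. It assumes the English
# "Your Highlight on page P | Location A-B | Added on ..." layout from the
# sample data; the leading "- " is optional so it also works after the split.
META_RE = re.compile(
    r"(?:- )?Your Highlight on page (?P<page>\d+)"
    r" \| Location (?P<start>\d+)(?:-(?P<end>\d+))?"
    r" \| Added on (?P<added>.+)"
)


def parse_meta(meta_line):
    """Return page, location range and timestamp text, or None if no match."""
    match = META_RE.match(meta_line.strip())
    if match is None:
        return None
    start = int(match.group("start"))
    # A single-location highlight has no end, so reuse the start
    end = int(match.group("end") or start)
    return {
        "page": int(match.group("page")),
        "location": (start, end),
        # Left as text; parsing it into a datetime depends on locale settings
        "added": match.group("added").strip(),
    }


sample = (
    "- Your Highlight on page 187 | Location 2853-2855 | "
    "Added on Tuesday, October 2, 2017 8:47:09 PM"
)
info = parse_meta(sample)
print(info)
```

To use it with the loop above, you would keep the first half of `quote.split('\n\n')` instead of assigning it to `_` and pass it to `parse_meta`.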
Right now I’m 49 books deep. It’s crunch time, but I can see the end! Look for my annual 52 Quotes post in a couple of days!