| Column | Description |
|---|---|
| filename | The original PDF filename from the ACM Digital Library |
| year | The year of publication. The CSCW 2018 online-first edition of PACMHCI is coded as 2017.5. | 
| title_from_text | The paper title derived from the paper text. This may be incomplete or may also include author names. | 
| lead_author | The lead author of the paper, based on the filename from the ACM DL | 
| num_pages | Number of pages in the PDF | 
| total_words | Total number of words in the paper, where a word is any token delimited by contiguous whitespace. | 
| total_words_nopunct | Total number of words after replacing all punctuation with spaces. | 
| body_len_words | Number of words in the paper's front matter and body, excluding references and appendices. Calculated with `total_words` method. | 
| body_len_words_nopunct | Number of words in the paper's front matter and body, excluding references and appendices. Calculated with `total_words_nopunct` method. | 
| body_len_chars | Number of characters in the paper's front matter and body, excluding references and appendices. | 
| ref_len_chars | Number of characters in the paper's reference section. | 
| ref_len_words | Number of words in the paper's reference section. Calculated with `total_words` method. | 
| ref_len_words_nopunct | Number of words in the paper's reference section. Calculated with `total_words_nopunct` method. | 
| appx_len_chars | Number of characters in the paper's appendix section. Value is nan if no appendix was found. | 
| appx_len_words | Number of words in the paper's appendix section. Calculated with `total_words` method. | 
| appx_len_words_nopunct | Number of words in the paper's appendix section. Calculated with `total_words_nopunct` method. | 
| ref_count_approx | Approximate number of references cited. | 
| words_per_page | Average number of words (`total_words` method) per page | 
| words_nopunct_per_page | Average number of words (`total_words_nopunct` method) per page | 
| chars_per_word | Average number of characters per word (`total_words` method) | 
| chars_per_word_nopunct | Average number of characters per word (`total_words_nopunct` method) | 
| body_words_nopunct_per_ref_count | Number of body words (`total_words_nopunct` method) per reference cited. | 
| title_has_quote | 1 if title has a single or double quotation mark, 0 if not |
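
The two word-counting methods named in the table can be sketched as follows. This is a minimal illustration based only on the descriptions above, not the code that produced the dataset: the function names mirror the column names, and the use of Python's `string.punctuation` (ASCII punctuation only) as the definition of "punctuation" is an assumption.

```python
import string

# Assumption: "punctuation" means the ASCII punctuation characters in
# string.punctuation; the original pipeline may use a different set.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def total_words(text: str) -> int:
    """Count tokens delimited by any run of whitespace."""
    return len(text.split())


def total_words_nopunct(text: str) -> int:
    """Replace every punctuation character with a space, then count tokens."""
    return len(text.translate(_PUNCT_TO_SPACE).split())


if __name__ == "__main__":
    sample = "Well-being, e.g., matters."
    print(total_words(sample), total_words_nopunct(sample))
```

Because punctuation is replaced with spaces rather than deleted, hyphenated words and abbreviations split into several tokens, so the `_nopunct` counts are typically equal to or higher than the plain counts.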