Text Analysis
Methods
• Topic Modeling
• Information Retrieval
• Text Classification
• Sentiment Analysis
• Word Frequency Analysis
• Named Entity Recognition
• Collocation
• Word Embeddings
• Transformer Models
• Concordancing
Tools
- Voyant Tools – web-based reading and analysis environment for digital texts
- MALLET – Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text
- WordSeer 4 – text analysis environment that combines visualization, information retrieval, sensemaking, and natural language processing
- Orange Text Mining – open-source machine learning and data visualization for novices and experts
- AntConc – freeware corpus analysis toolkit for concordancing and text analysis
- Lexos – text analysis tool offering both web-app and local installation options
- Constellate – create corpora from JSTOR’s collections with a built-in Python analysis platform
- HathiTrust Research Center Analytics – supports large-scale computational analysis of the HathiTrust Digital Library
- HTRC Algorithms – tools for assembling and analyzing HathiTrust corpus collections (includes copyrighted items)
- Extracted Features Dataset – dataset for non-consumptive analysis of HathiTrust corpus features
- HathiTrust + Bookworm – visualize and analyze word usage trends in the HathiTrust corpus
- HTRC Data Capsule – secure computing environment for text analysis on HathiTrust corpus
- Google Books Ngram Viewer – graph usage of terms/phrases over time
- TAPoR 3 – sophisticated text analysis and retrieval tools
Visual Analysis
Tools
- IIIF – standardize delivery of images and audio/visual files with interactive annotation capabilities
- Loris – IIIF image server written in Python
- Cantaloupe – open-source dynamic image server written in Java
- Mirador 3 – open-source, multi-window image viewing platform with zoom, compare, and annotation features
- Universal Viewer – a community-developed tool for viewing various file types
- CatchPy – annotation server for IIIF image assets
- Tropy – organize and annotate photos of archival resources
- CVAT – open-source tool for image and video annotation
Data Visualizations
Tools
- Tableau – query databases and spreadsheets to generate graph-type data visualizations
- Flourish – create stunning charts, maps, and interactive content with no coding required
- Datawrapper – create interactive, responsive & beautiful data visualizations
- Google Looker Studio – convert data into customizable reports and dashboards
- Infogr.am – free basic account with optional fee-based infographic service
- Piktochart – convenient infographic editor
- Canva – experiment with data visualization using hundreds of free design elements
- Easel.ly – thousands of free infographic templates and design objects
- D3.js – JavaScript library for bespoke data visualization
Digital Annotation
Tools
- Hypothes.is – open annotation software running through a Chrome browser extension
- Tropy – organize and annotate photos of archival resources
- Recogito – annotate documents and photographs with a simple web-app
- Annotation Studio – collaborative web-based annotation tools from MIT HyperStudio
- Neatline – add-on tools for Omeka with image and map annotation capabilities
- Scalar – scholarly publishing software with built-in annotation tools for multiple media types
Spatial Analysis and Web Mapping
Tools
- QGIS – a free and open-source desktop geographic information system application
- CARTO – SaaS spatial analysis platform with GIS, web mapping, and data visualization features
- Esri ArcGIS Online – create and share ArcGIS maps online
- Neatline – tell stories with maps and timelines using this Omeka add-on
- Google Maps API – create real-world experiences with Maps, Routes, and Places features
- OpenLayers – put dynamic maps in any web page
- Mapbox – build customizable maps for web, mobile, automotive, and AR
- Story Maps – combine authoritative maps with narrative text, images, and multimedia
- Palladio – visualize complex historical data with ease
- Clio – educational app using GPS to connect users to the surrounding history
- Leaflet – JavaScript library for interactive maps
- Tilegrams – create tiled cartograms online
- MapAList – create customized Google maps from address lists
Network Analysis
Tools
- Gephi – visualization and exploration software for graphs and networks
- Net.Create – an open-source tool for simultaneous multi-user network data entry
- Palladio – visualize complex historical data with ease
- Cytoscape – an open-source platform for visualizing complex networks
- NodeXL – Microsoft Excel plugin for network visualization and analysis
- NetworkX – Python package for network creation, manipulation, and analysis
- igraph – network analysis tools with emphasis on efficiency and portability
- visNetwork – R package for network visualization using the vis.js library
- D3.js – JavaScript library for bespoke data visualization
Timeline and Temporal Analysis
Tools
- TimelineJS – open-source tool for building visually rich, interactive timelines
- Chronos Timeline – render interactive timelines in Obsidian notes from simple Markdown
- Neatline – tell stories with maps and timelines using this Omeka add-on
- TimeGlider – web-based timeline builder
- TimeToast – create timelines to add to websites or blogs
- Viewshare – a free platform for generating interactive maps and timelines
Machine Learning
Tools
- ChatGPT
- Google Gemini
- Microsoft Copilot
- Check out our AI Toolkit!
Database Development
Tools
- FileMaker Pro – cross-platform relational database application
- PostgreSQL – a powerful, open-source object-relational database system
- MySQL – open-source relational database management system
- MongoDB – cross-platform, document-oriented database program
- Elasticsearch – distributed, RESTful search and analytics engine
- Solr – open source, multi-modal search platform built on Apache Lucene
- AWS DynamoDB – serverless, NoSQL database service
- Neo4j – graph database management system
- DataGrip – a cross-platform tool for relational and NoSQL databases
- Postico – native Mac app for PostgreSQL
- SQL Server Management Studio – configure, manage, and administer Microsoft SQL Server components
- Corpora – database, REST API, and data collection interface in one
Data Cleaning
Tools
- OpenRefine – an open-source desktop application for data cleanup and transformation
- Tidyverse – collection of R packages for data science; its tidyr package provides functions for reshaping data into a consistent, tidy form
- pandas – open-source data analysis and manipulation tool built on Python
Project Management
Tools
- Trello – web-based, kanban-style, list-making application
- GitHub Projects – an adaptable spreadsheet and task board that integrates with GitHub issues
- Asana – web and mobile work management platform
- Monday.com – adaptable project management software
- Airtable – variety of project management templates
Citation Management
Tools
- Zotero – free tool to collect, organize, cite, and share research
- EndNote – commercial reference management software for managing bibliographies
- Mendeley – reference management software
Digital Collections
Tools
- Omeka – free, flexible, open source web-publishing platform for libraries, museums, and scholarly collections
- Scalar – free, open source authoring and publishing platform for born-digital scholarship
- Story Maps – combine authoritative maps with narrative text, images, and multimedia
- Mukurtu – a content management system for sharing information in culturally relevant ways
- Neatline – tell stories with maps and timelines using Omeka add-on tools
Digital Publishing
Tools
- Juxta – open-source tool for comparing and collating multiple witnesses to a textual work
- Oxygen – a comprehensive suite of XML authoring and development tools
- Manifold Scholarship – open-source, free publication software
- eScholarship – scholarly publishing and repository services for UC-associated scholars
- UCLA Library Open Access Research Guide
- UCLA Library Open Access & Publishing
- Open-access monograph publishing via Luminos or TOME – publication fees for UCLA faculty authors are covered by the library, with support from Arcadia
- UCLA faculty authors can submit monograph proposals to the relevant UC Press Editor
Web Development
Tools
- Drupal – free, open-source content management system
- WordPress – web content management system
- Google Sites – free, easy-to-use website builder
- GitHub Pages
Data Curation and Management
Tools
- Git – distributed version control system for tracking file versions
- Dataverse – open source research data repository software
- GitHub – developer platform for creating, storing, managing, and sharing code
Programming Languages and Packages
Tools
- Jupyter Notebooks – free software, open standards, and web services for interactive computing across all programming languages
Python
- Natural Language Toolkit (NLTK) – leading platform for building Python programs to work with human language data
- spaCy – free, open-source library for advanced Natural Language Processing in Python
- Gensim – free open-source Python library for representing documents as semantic vectors
- Matplotlib – comprehensive library for creating static, animated, and interactive visualizations in Python
- Seaborn – Python data visualization library based on matplotlib
- Plotly – open source graphing library
R
- quanteda – R package for managing and analyzing text
- tidytext – text mining with R
- spacyr – provides a convenient R wrapper around the Python spaCy package
- ggplot2 – system for declaratively creating graphics based on The Grammar of Graphics
- Plotly – open source graphing library
Coding
- Oxygen XML Editor – premier tool for XML editing, authoring and development
Transcription
Tools
- ABBYY FineReader
- Scripto – open-source tool for viewing and transcribing digital files
- oTranscribe – free web-based audio transcription interface
- eScriptorium – digital text production pipeline for print and handwritten texts using machine learning
- Transkribus – AI platform for automatically recognizing text, layout, and structure in historical documents
- Amazon Textract – machine learning service that automatically extracts text and data from scanned documents
- Google Document AI – create document processors that automate tasks and improve data extraction