List of Technologies 2015 Fall

Note: these are technologies used in student projects from one of the Big Data courses. They are listed for reference only.

Python Packages
Tech Type Website Description
BeautifulSoup Python Package link Python package for parsing HTML and XML documents, commonly used for web scraping
dpkt Python Package link  
Folium Python package link Folium - Python map API with Leaflet.js
gmplot Python package link interface to plotting data with Google Maps
Hadoopy Python package link Hadoopy: Python wrapper for Hadoop using Cython
HappyBase Python package link Python Package for Apache HBase
Indico API - IndicoIo Python package link machine learning toolkits including sentiment analysis
laspy Python Package link Python library for reading and writing LAS (LiDAR point cloud) files
matplotlib Python Package link matplotlib: Basic plotting library in Python; several other Python plotting libraries (e.g. Seaborn) are built on top of it.
NetworkX: complex networks Python Package link Creation, analysis, and visualization of complex networks
NLTK Python Package link Natural Language Toolkit
Numpy Python package link Provides a fast numerical array structure and helper functions.
Pandas Python package link pandas: Provides a DataFrame structure to store data in memory and work with it easily and efficiently.
pandasql Python package link Python package for querying pandas DataFrames using SQL syntax
Psycopg2 Python package link PostgreSQL database adapter for Python
Pygal Python Package link Chart tool
pygeoip Python Package link  
pylab Python Package link  
scikit-learn Python Package link Python machine learning package
scipy Python Package link Python library for scientific computing
Seaborn Python Package link Python library for statistical data visualization
tkinter Python Package link Python interface to Tcl/Tk
Tweepy Python Package link Python package for Twitter
Unirest Python Package link Lightweight HTTP Request Client Libraries
vaderSentiment Python Package link Python package for VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis
Software
Tech Type Website Description
Apache Lucene Software link text search engine library written in Java
Cassandra Software link NoSQL distributed database management system written in Java
Cloudmesh Software link Management framework for virtual environments
gSplit Software link File splitter
HBase Software link Distributed, column-oriented NoSQL database written in Java
Jupyter (IPython Notebook) Software link Interactive Python with web interface
LAStools : las2txt and mergelas Software link LiDAR processing program
Microsoft SQL Server Streaminsight Software link Microsoft software for complex event processing (CEP) applications
MongoDB Software link Document oriented Database
MPI (Open MPI) Software link Message Passing Interface
MS Visual Studio Software link IDE for Microsoft software developments
NetBeans Software link IDE for Java development (mainly) and other languages
Postgresql Software link Object-relational database management system (ORDBMS)
SQL Server Compact Software link Free compact relational database provided by Microsoft
sqlite Software link File-based relational database management system written in C
Tableau Software link Visualization
Eclipse IDE Software - Client Program link IDE for Java development (mainly) and other languages
RStudio Software - Client Program link Open Source IDE for R
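To illustrate the "file-based relational database" idea behind the sqlite entry above, here is a minimal sketch using Python's standard-library sqlite3 module. The table name `tech`, the helper functions, and the sample rows are hypothetical and only mirror this list. An in-memory database is assumed so that no file is created.

```python
import sqlite3

def build_catalog(rows):
    """Create an in-memory SQLite database holding (name, kind) rows."""
    conn = sqlite3.connect(":memory:")  # pass a file path instead to persist to disk
    conn.execute("CREATE TABLE tech (name TEXT PRIMARY KEY, kind TEXT)")
    conn.executemany("INSERT INTO tech (name, kind) VALUES (?, ?)", rows)
    conn.commit()
    return conn

def names_of_kind(conn, kind):
    """Return the names of all rows of the given kind, sorted by name."""
    cur = conn.execute(
        "SELECT name FROM tech WHERE kind = ? ORDER BY name", (kind,)
    )
    return [row[0] for row in cur.fetchall()]

# Hypothetical sample rows taken from the Software list above.
SAMPLE_ROWS = [
    ("MongoDB", "database"),
    ("Postgresql", "database"),
    ("NetBeans", "IDE"),
    ("RStudio", "IDE"),
]

conn = build_catalog(SAMPLE_ROWS)
print(names_of_kind(conn, "database"))
```

The `?` placeholders let sqlite3 bind the values safely instead of building SQL strings by hand.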
Others
Tech Type Website Description
Splunk for PCAP analyzer Apps link Network packet capture and analyzer
Tweetinvi C# library link Tweetinvi - a friendly Twitter C# library
Datumbox Framework link Datumbox Machine Learning Framework written in Java
Hadoop Framework link MapReduce implementation written in Java with HDFS (Hadoop Distributed File System)
Spark Framework link Cluster computing framework with in-memory primitives; supports MapReduce-style processing
Apache POI Java API link Java API for Microsoft Office documents
JFreeChart Java Library link Java Chart library
Bootstrap.js Javascript link Web templates
C3js Javascript - visualization link D3-based reusable chart library
D3 Javascript - visualization link Javascript library for visualization on the web
Raw Javascript - visualization link create vector visualizations with d3.js from csv files
jQuery Javascript library link Javascript library to simplify JS functions e.g. Ajax
MLLib Library link Spark’s Machine Learning Library
Pig Platform link High-level programming tool for MapReduce
Apache Commons Project link Apache Commons provides reusable Java components
Karst at IU Resource link HPC at Indiana University
Amazon RDS Web Services link Amazon Relational Database Service (RDS)
Google Charts Web Services link Javascript chart tools with HTTP requests
Google Earth API Web Services link (deprecated)
Google Geochart Web Services link a map of a country, a continent, or a region with areas identified in one of three ways: region, markers, or text
Google Geolocation API Web Services link returns a location and accuracy radius based on information about cell towers and WiFi nodes that the mobile client can detect
Mashape Web services link Private company that offers open source tools and cloud services
Plotly Web services link visualization in Excel, R, and Python
Cloudera Web Services - Hadoop link Hadoop-based software company which provides CDH (Cloudera Distribution Including Apache Hadoop)
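Several entries above (Hadoop, Spark, Pig) are described in terms of MapReduce. As a rough single-machine sketch of that programming model, the pure-Python word count below separates the work into map, shuffle, and reduce phases. It does not use the Hadoop or Spark APIs; all function names here are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)

def word_count(lines):
    """Run the three phases in sequence and return a word -> count dict."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())

print(word_count(["big data big compute", "data everywhere"]))
```

In a real cluster the map and reduce calls would run in parallel on different nodes, and the shuffle would move data between them. Here everything runs in one process to keep the structure visible.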
R packages
Tech Type Website Description
C50 R package link Decision Trees
caret R package link Classification and Regression Training
corrplot R package link To plot the correlation between the features
cowplot R package link simple add-on to ggplot2 in R
Deducer R package link Plot ROC plot for logistic regression model
dplyr R package link A Grammar of Data Manipulation
e1071 R package link SVM and naïve Bayes
ggmap R package link Spatial Visualization with ggplot2
ggplot2 R package link Visualizations of feature interactions
ggthemes R package link Some extra geoms, scales, and themes for ggplot2 in R
glm2 R package link glm2: Fitting Generalized Linear Models (Logistic Regression Analysis)
MASS R package link Use stepAIC to generate top 10 models according to AIC criterion
randomforest R package   To evaluate the importance of each feature in predicting credit card purchases.
pitchRx R package link Major League Baseball (MLB) data and visualization PITCHf/x in R
plyr R package link Splitting, Applying, and Combining Data in R
pROC R package link Display and Analyze ROC Curves
randomForest R package link Breiman and Cutler’s Random Forests for Classification and Regression
rattle R package link Graphical User Interface for Data Mining in R (v.4.0.5)
RColorBrewer R package link ColorBrewer Palettes
reshape2 R package link R package to transform data between wide and long formats with two key functions: melt and dcast (or acast)
Rglpk R package link R/GNU Linear Programming Kit Interface in R
rpart R package link Recursive Partitioning and Regression Trees
rpart.plot R package   Plot ‘rpart’ Models: An Enhanced Version of ‘plot.rpart’ (v.1.5.3)
RSQLite R package link SQLite Interface for R