List of Technologies 2015 Fall
Note that these are technologies used in student projects from one of the Big Data courses. They are listed for reference only.
Tech | Type | Website | Description |
---|---|---|---|
BeautifulSoup | Python Package | link | Python package for parsing HTML and XML documents, commonly used for web scraping |
dpkt | Python Package | link | |
Folium | Python package | link | Folium - Python map API with Leaflet.js |
gmplot | Python package | link | interface to plotting data with Google Maps |
Hadoopy | Python package | link | Hadoopy: Python wrapper for Hadoop using Cython |
HappyBase | Python package | link | Python Package for Apache HBase |
Indico API - IndicoIo | Python package | link | machine learning toolkits including sentiment analysis |
laspy | Python Package | link | Python library for reading and writing LAS (LiDAR) files |
matplotlib | Python Package | link | matplotlib: Basic plotting library in Python; most other Python plotting libraries are built on top of it. |
NetworkX: complex networks | Python Package | link | Creation, analysis, and visualization of complex networks |
NLTK | Python Package | link | Natural Language Toolkit |
Numpy | Python package | link | Provides a fast numerical array structure and helper functions. |
Pandas | Python package | link | pandas: Provides a DataFrame structure to store data in memory and work with it easily and efficiently. |
pandasql | Python package | link | Python package for querying pandas DataFrames using SQL syntax |
Psycopg2 | Python package | link | PostgreSQL database adapter for Python |
Pygal | Python Package | link | Chart tool |
pygeoip | Python Package | link | |
pylab | Python Package | link | |
scikit-learn | Python Package | link | Python machine learning package |
scipy | Python Package | link | Python library for scientific computing |
Seaborn | Python Package | link | Python library for statistical data visualization |
tkinter | Python Package | link | Python interface to Tcl/Tk |
Tweepy | Python Package | link | Python package for Twitter |
Unirest | Python Package | link | Lightweight HTTP Request Client Libraries |
vaderSentiment | Python Package | link | Python package for VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis |
Tech | Type | Website | Description |
---|---|---|---|
Apache Lucene | Software | link | text search engine library written in Java |
Cassandra | Software | link | NoSQL distributed database management system written in Java |
Cloudmesh | Software | link | Management framework for virtual environments |
gSplit | Software | link | File splitter |
HBase | Software | link | Distributed NoSQL database that runs on top of Hadoop, written in Java |
Jupyter (IPython Notebook) | Software | link | Interactive Python with web interface |
LAStools : las2txt and mergelas | Software | link | LiDAR processing program |
Microsoft SQL Server Streaminsight | Software | link | Microsoft software for complex event processing (CEP) applications |
MongoDB | Software | link | Document oriented Database |
MPI (Open MPI) | Software | link | Message Passing Interface |
MS Visual Studio | Software | link | IDE for Microsoft software developments |
NetBeans | Software | link | IDE for Java development (mainly) and other languages |
Postgresql | Software | link | Object-relational database management system (ORDBMS) |
SQL Server Compact | Software | link | Free compact relational database provided by Microsoft |
sqlite | Software | link | File-based relational database management system written in C |
Tableau | Software | link | Visualization |
Eclipse IDE | Software - Client Program | link | IDE for Java development (mainly) and other languages |
RStudio | Software - Client Program | link | Open Source IDE for R |
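
As a small illustration of the sqlite entry above, the following sketch uses Python's standard-library `sqlite3` module to load a few rows from this list into an in-memory database and count technologies per type. The table name `tech`, the helper `count_by_type`, and the sample rows are assumptions made for this example only; they are not taken from any student project.

```python
import sqlite3

# Hypothetical sample rows (name, type) copied from the table above.
SAMPLE_ROWS = [
    ("MongoDB", "Software"),
    ("Postgresql", "Software"),
    ("RStudio", "Software - Client Program"),
]


def count_by_type(rows):
    """Load (name, type) pairs into an in-memory SQLite database
    and return a list of (type, count) tuples, most frequent type first."""
    conn = sqlite3.connect(":memory:")  # no file is created on disk
    try:
        conn.execute(
            "CREATE TABLE tech (name TEXT PRIMARY KEY, type TEXT NOT NULL)"
        )
        # Parameterized inserts avoid building SQL strings by hand.
        conn.executemany("INSERT INTO tech (name, type) VALUES (?, ?)", rows)
        conn.commit()
        cursor = conn.execute(
            "SELECT type, COUNT(*) FROM tech "
            "GROUP BY type ORDER BY COUNT(*) DESC, type"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for tech_type, count in count_by_type(SAMPLE_ROWS):
        print(f"{tech_type}: {count}")
```

Passing a file path instead of `":memory:"` to `sqlite3.connect` would keep the data in a single database file, which is what makes sqlite convenient for small projects.
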
Tech | Type | Website | Description |
---|---|---|---|
Splunk for PCAP analyzer | Apps | link | Network packet capture and analyzer |
Tweetinvi | C# library | link | Tweetinvi - a friendly Twitter C# library |
Datumbox | Framework | link | Datumbox Machine Learning Framework written in Java |
Hadoop | Framework | link | MapReduce implementation written in Java with HDFS (Hadoop Distributed File System) |
Spark | Framework | link | Cluster computing framework with in-memory primitives, often used in place of MapReduce |
Apache POI | Java API | link | Java API for Microsoft Office documents |
JFreeChart | Java Library | link | Java Chart library |
Bootstrap.js | Javascript | link | Web templates |
C3js | Javascript - visualization | link | D3-based reusable chart library |
D3 | Javascript - visualization | link | Javascript library for visualization on the web |
Raw | Javascript - visualization | link | create vector visualizations with d3.js from csv files |
jQuery | Javascript library | link | Javascript library to simplify JS functions e.g. Ajax |
MLLib | Library | link | Spark’s Machine Learning Library |
Pig | Platform | link | High-level programming tool for MapReduce |
Apache Commons | Project | link | Apache Commons provides reusable Java components |
Karst at IU | Resource | link | HPC at Indiana University |
Amazon RDS | Web Services | link | Amazon Relational Database Service (RDS) |
Google Charts | Web Services | link | Javascript chart tools with HTTP requests |
Google Earth API | Web Services | link | (deprecated) |
Google Geochart | Web Services | link | |
Google Geolocation API | Web Services | link | returns a location and accuracy radius based on information about cell towers and WiFi nodes that the mobile client can detect |
Mashape | Web services | link | Private company that offers open source tools and cloud services |
Plotly | Web services | link | Visualization in Excel, R, and Python |
Cloudera | Web Services - Hadoop | link | Hadoop-based software company which provides CDH (Cloudera Distribution Including Apache Hadoop) |
Tech | Type | Website | Description |
---|---|---|---|
C50 | R package | link | Decision Trees |
caret | R package | link | Classification and Regression Training |
corrplot | R package | link | To plot the correlation between the features |
cowplot | R package | link | simple add-on to ggplot2 in R |
Deducer | R package | link | Plot ROC plot for logistic regression model |
dplyr | R package | link | A Grammar of Data Manipulation |
e1071 | R package | link | SVM and naïve Bayes |
ggmap | R package | link | Spatial Visualization with ggplot2 |
ggplot2 | R package | link | Visualizations of feature interactions |
ggthemes | R package | link | Some extra geoms, scales, and themes for ggplot in R |
glm2 | R package | link | glm2: Fitting Generalized Linear Models (Logistic Regression Analysis) |
MASS | R package | link | Use stepAIC to generate top 10 models according to AIC criterion |
randomForest | R package | | To evaluate the importance of each feature in predicting credit card purchases |
pitchRx | R package | link | Major League Baseball (MLB) data and visualization PITCHf/x in R |
plyr | R package | link | Splitting, Applying, and Combining Data in R |
pROC | R package | link | Display and Analyze ROC Curves |
randomForest | R package | link | Breiman and Cutler’s Random Forests for Classification and Regression |
rattle | R package | link | Graphical User Interface for Data Mining in R (v.4.0.5) |
RColorBrewer | R package | link | ColorBrewer Palettes |
reshape2 | R package | link | R package to transform data between wide and long formats with two key functions: melt and dcast (or acast) |
Rglpk | R package | link | R/GNU Linear Programming Kit Interface in R |
rpart | R package | link | Recursive Partitioning and Regression Trees |
rpart.plot | R package | | Plot ‘rpart’ Models: An Enhanced Version of ‘plot.rpart’ (v.1.5.3) |
RSQLite | R package | link | SQLite Interface for R |