List of Technologies 2015 Fall
Note that these are the technologies used in student projects from one of the Big Data courses. They are listed for reference only.
| Tech | Type | Website | Description |
|---|---|---|---|
| BeautifulSoup | Python Package | link | Python package for parsing HTML and XML documents, commonly used for web scraping |
| dpkt | Python Package | link | |
| Folium | Python package | link | Folium - Python map API with Leaflet.js |
| gmplot | Python package | link | interface to plotting data with Google Maps |
| Hadoopy | Python package | link | Hadoopy: Python wrapper for Hadoop using Cython |
| HappyBase | Python package | link | Python Package for Apache HBase |
| Indico API - IndicoIo | Python package | link | machine learning toolkits including sentiment analysis |
| laspy | Python Package | link | Python library for reading and writing LAS (LiDAR) files |
| matplotlib | Python Package | link | matplotlib: Basic plotting library in Python; most other Python plotting libraries are built on top of it. |
| NetworkX: complex networks | Python Package | link | Creation, analysis, and visualization of complex networks |
| NLTK | Python Package | link | Natural Language Toolkit |
| Numpy | Python package | link | Provides a fast numerical array structure and helper functions. |
| Pandas | Python package | link | pandas: Provides a DataFrame structure to store data in memory and work with it easily and efficiently. |
| pandasql | Python package | link | Python package for querying pandas DataFrames using SQL syntax |
| Psycopg2 | Python package | link | PostgreSQL adapter for Python |
| Pygal | Python Package | link | Chart tool |
| pygeoip | Python Package | link | |
| pylab | Python Package | link | |
| scikit-learn | Python Package | link | Python machine learning package |
| scipy | Python Package | link | Python library for scientific computing |
| Seaborn | Python Package | link | Python library for statistical data visualization |
| tkinter | Python Package | link | Python interface to Tcl/Tk |
| Tweepy | Python Package | link | Python package for Twitter |
| Unirest | Python Package | link | Lightweight HTTP Request Client Libraries |
| vaderSentiment | Python Package | link | Python package for VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis |

| Tech | Type | Website | Description |
|---|---|---|---|
| Apache Lucene | Software | link | text search engine library written in Java |
| Cassandra | Software | link | NoSQL distributed database management system written in Java |
| Cloudmesh | Software | link | Management framework for virtual environments |
| gSplit | Software | link | File splitter |
| HBase | Software | link | Distributed, column-oriented NoSQL database written in Java |
| Jupyter (IPython Notebook) | Software | link | Interactive Python with web interface |
| LAStools : las2txt and mergelas | Software | link | LiDAR processing program |
| Microsoft SQL Server Streaminsight | Software | link | Microsoft software for complex event processing (CEP) applications |
| MongoDB | Software | link | Document oriented Database |
| MPI (Open MPI) | Software | link | Message Passing Interface |
| MS Visual Studio | Software | link | IDE for Microsoft software development |
| NetBeans | Software | link | IDE for Java development (mainly) and other languages |
| PostgreSQL | Software | link | Object-relational database management system (ORDBMS) |
| SQL Server Compact | Software | link | Free compact relational database provided by Microsoft |
| sqlite | Software | link | File-based relational database management system written in C |
| Tableau | Software | link | Visualization |
| Eclipse IDE | Software - Client Program | link | IDE for Java development (mainly) and other languages |
| RStudio | Software - Client Program | link | Open Source IDE for R |
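
As a small illustration of the relational databases listed in the table above, the following is a minimal sketch that uses Python's standard `sqlite3` module. The table name `tech`, the helper `count_by_type`, and the sample rows are hypothetical and chosen only for this example. It uses an in-memory database; passing a file path to `sqlite3.connect` instead would store the data in a single file.

```python
import sqlite3


def count_by_type(rows):
    """Load (name, kind) rows into SQLite and count the rows per kind.

    Hypothetical helper for illustration; it is not taken from any
    student project.
    """
    # ":memory:" keeps the database in RAM; a file path would persist it.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tech (name TEXT, kind TEXT)")
        # Parameterized inserts avoid building SQL strings by hand.
        conn.executemany("INSERT INTO tech VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT kind, COUNT(*) FROM tech GROUP BY kind ORDER BY kind"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Sample rows (assumed data, loosely based on entries in this list).
rows = [
    ("sqlite", "Software"),
    ("MongoDB", "Software"),
    ("NLTK", "Python Package"),
]
counts = count_by_type(rows)
print(counts)
```

The same query pattern carries over to the other SQL databases in this list through their own Python adapters, although connection setup differs for each.
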

| Tech | Type | Website | Description |
|---|---|---|---|
| Splunk for PCAP analyzer | Apps | link | Network packet capture and analyzer |
| Tweetinvi | C# library | link | Tweetinvi - a friendly Twitter C# library |
| Datumbox | Framework | link | Datumbox Machine Learning Framework written in Java |
| Hadoop | Framework | link | MapReduce implementation written in Java with HDFS (Hadoop Distributed File System) |
| Spark | Framework | link | Cluster computing framework with in-memory primitives that generalizes the MapReduce model |
| Apache POI | Java API | link | Java API for Microsoft Office documents |
| JFreeChart | Java Library | link | Java Chart library |
| Bootstrap.js | Javascript | link | Web templates |
| C3js | Javascript - visualization | link | D3-based reusable chart library |
| D3 | Javascript - visualization | link | Javascript library for visualization on the web |
| Raw | Javascript - visualization | link | create vector visualizations with d3.js from csv files |
| jQuery | Javascript library | link | Javascript library to simplify JS functions e.g. Ajax |
| MLLib | Library | link | Spark’s Machine Learning Library |
| Pig | Platform | link | High-level programming tool for MapReduce |
| Apache Commons | Project | link | Apache Commons provides reusable Java components |
| Karst at IU | Resource | link | HPC at Indiana University |
| Amazon RDS | Web Services | link | Amazon Relational Database Service (RDS) |
| Google Charts | Web Services | link | Javascript chart tools with HTTP requests |
| Google Earth API | Web Services | link | (deprecated) |
| Google Geochart | Web Services | link | |
| Google Geolocation API | Web Services | link | returns a location and accuracy radius based on information about cell towers and WiFi nodes that the mobile client can detect |
| Mashape | Web services | link | Private company that offers open source tools and cloud services |
| Plotly | Web services | link | Visualization in Excel, R, and Python |
| Cloudera | Web Services - Hadoop | link | Hadoop-based software company which provides CDH (Cloudera Distribution Including Apache Hadoop) |

| Tech | Type | Website | Description |
|---|---|---|---|
| C50 | R package | link | Decision Trees |
| caret | R package | link | Classification and Regression Training |
| corrplot | R package | link | To plot the correlation between the features |
| cowplot | R package | link | simple add-on to ggplot2 in R |
| Deducer | R package | link | Plot ROC plot for logistic regression model |
| dplyr | R package | link | A Grammar of Data Manipulation |
| e1071 | R package | link | SVM and naïve Bayes |
| ggmap | R package | link | Spatial Visualization with ggplot2 |
| ggplot2 | R package | link | Visualizations of feature interactions |
| ggthemes | R package | link | Extra geoms, scales, and themes for ggplot2 in R |
| glm2 | R package | link | glm2: Fitting Generalized Linear Models (Logistic Regression Analysis) |
| MASS | R package | link | Use stepAIC to generate top 10 models according to AIC criterion |
| randomforest | R package | | Evaluates the importance of each feature for predicting credit card purchases |
| pitchRx | R package | link | Major League Baseball (MLB) data and visualization PITCHf/x in R |
| plyr | R package | link | Splitting, Applying, and Combining Data in R |
| pROC | R package | link | Display and Analyze ROC Curves |
| randomForest | R package | link | Breiman and Cutler’s Random Forests for Classification and Regression |
| rattle | R package | link | Graphical User Interface for Data Mining in R (v.4.0.5) |
| RColorBrewer | R package | link | ColorBrewer Palettes |
| reshape2 | R package | link | R package to transform data between wide and long formats with two key functions: melt and cast |
| Rglpk | R package | link | R/GNU Linear Programming Kit Interface in R |
| rpart | R package | link | Recursive Partitioning and Regression Trees |
| rpart.plot | R package | | Plot ‘rpart’ Models: An Enhanced Version of ‘plot.rpart’ (v.1.5.3) |
| RSQLite | R package | link | SQLite Interface for R |