MTI TEK
Big Data & Distributed Systems
  1. Big Data ecosystem
  2. Apache Hadoop
    1. Install and configure Apache Hadoop (single-node cluster) (3.3.0)
    2. HDFS Commands
      1. HDFS - DFS Commands
      2. HDFS - DFSADMIN Commands
    3. ORC/Parquet/Avro Tools
      1. ORC Tools (1.5.4)
      2. Parquet Tools (1.9.0)
      3. Avro Tools (1.9.0)
  3. Apache Hive
    1. Install and configure Apache Hive (HiveServer, Hive MetaStore) (3.1.2)
    2. Manage Hive Databases
  4. Apache Spark
    1. Install and configure Apache Spark (standalone) (3.0.0)
    2. Access Hive Tables using Spark SQL
    3. Spark Tools
      1. Spark Interactive Shell (Scala): spark-shell
      2. Spark Interactive Shell (Python): pyspark
      3. Spark Interactive Shell (R): sparkR
      4. Submitting Applications: spark-submit
      5. Spark SQL CLI: spark-sql
    4. Spark API: RDD, DataFrame, Dataset
© 2025 mtitek