Big Data | Spark SQL CLI: spark-sql
  1. spark-sql command line options
  2. Start spark-sql
  3. spark-sql commands

  1. spark-sql command line options
    spark-sql is a convenient tool to run the Hive metastore service in local mode and execute queries entered from the command line.
    Note that spark-sql cannot talk to the Thrift JDBC server.
    To configure Hive, place your hive-site.xml, core-site.xml and hdfs-site.xml files in ${SPARK_HOME}/conf/.
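
    As a minimal sketch of that configuration step, the helper below copies whichever of the three files exist into a destination directory. The function name copy_hive_conf and the source locations (HIVE_CONF_DIR, HADOOP_CONF_DIR and their fallback paths) are assumptions for illustration; adjust them to your installation.

```shell
# copy_hive_conf DEST FILE... : copy each existing FILE into DEST.
# Hypothetical helper, not part of Spark or Hive.
copy_hive_conf() {
  chc_dest="$1"
  shift
  mkdir -p "$chc_dest"
  for chc_file in "$@"; do
    if [ -f "$chc_file" ]; then
      cp "$chc_file" "$chc_dest/"
    else
      # A missing file is reported but does not abort the copy.
      echo "skipping missing file: $chc_file" >&2
    fi
  done
}

# Only run the copy when SPARK_HOME is set; the source paths are assumed defaults.
if [ -n "${SPARK_HOME:-}" ]; then
  copy_hive_conf "${SPARK_HOME}/conf" \
    "${HIVE_CONF_DIR:-/etc/hive/conf}/hive-site.xml" \
    "${HADOOP_CONF_DIR:-/etc/hadoop/conf}/core-site.xml" \
    "${HADOOP_CONF_DIR:-/etc/hadoop/conf}/hdfs-site.xml"
fi
```

    Copying (rather than symlinking) keeps the Spark configuration independent of later changes to the Hive or Hadoop directories; a symlink may suit you better if the files should stay in sync.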


    • Generic options:
    • Cluster deploy mode only:
    • Spark standalone or Mesos with cluster deploy mode only:
    • Spark standalone and Mesos only:
    • Spark standalone and YARN only:
    • YARN only:
    • CLI options:
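
    The full text for each of these option groups can be printed from the installed Spark build. The sketch below assumes spark-sql is on the PATH (for example through ${SPARK_HOME}/bin); the variable name HELP_TEXT is only illustrative.

```shell
# Capture and print the usage text of the installed spark-sql.
if command -v spark-sql >/dev/null 2>&1; then
  # The usage text may be written to stderr, so merge both streams.
  HELP_TEXT="$(spark-sql --help 2>&1)"
  printf '%s\n' "$HELP_TEXT"
else
  HELP_TEXT=""
  echo "spark-sql not found on PATH; add \${SPARK_HOME}/bin to PATH first" >&2
fi
```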
  2. Start spark-sql

    To exit spark-sql, type: exit;

    Note the Hive configuration option --hiveconf "hive.exec.scratchdir=/tmp/a-folder-that-the-current-user-has-permission-to-write-in".
    By default, "hive.exec.scratchdir" is set to "/tmp/hive".
    If you don't set this option and the current user cannot write to the default directory, spark-sql might fail with an error at startup.
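
    To avoid that startup failure, the sketch below creates a per-user scratch directory and passes it through --hiveconf. The directory name is an assumption chosen for illustration, and the example assumes the scratch directory lives on the local filesystem; if your default filesystem is HDFS, the path would need to exist and be writable there instead.

```shell
# Per-user scratch directory (illustrative name, not a Spark or Hive default).
SCRATCH_DIR="${TMPDIR:-/tmp}/spark-sql-scratch-$(id -un)"
mkdir -p "$SCRATCH_DIR"

if command -v spark-sql >/dev/null 2>&1; then
  # Start the interactive CLI with the writable scratch directory.
  spark-sql --hiveconf "hive.exec.scratchdir=${SCRATCH_DIR}"
else
  echo "spark-sql not found on PATH; add \${SPARK_HOME}/bin to PATH first" >&2
fi
```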
  3. spark-sql commands
    You can use one of the following commands:
    Query: show databases;
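
    The same query can also be run without entering the interactive prompt. This is a hedged sketch using the CLI's -e (inline query) and -f (query file) options; the variable names and the temporary file are illustrative assumptions.

```shell
# Query to run; statements are terminated with a semicolon.
QUERY="show databases;"

# Write the same statement to a temporary script file for the -f variant.
SQL_FILE="$(mktemp)"
printf '%s\n' "$QUERY" > "$SQL_FILE"

if command -v spark-sql >/dev/null 2>&1; then
  # Run the query passed inline, then exit.
  spark-sql -e "$QUERY"
  # Run the statements stored in the file, then exit.
  spark-sql -f "$SQL_FILE"
else
  echo "spark-sql not found on PATH; add \${SPARK_HOME}/bin to PATH first" >&2
fi
```

    The -f form tends to be easier to maintain once a script grows beyond a single statement.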
© 2025  mtitek