I have been using Apache Zeppelin for some time now. I want to choose a good tool for reporting with Spark/Scala. Please compare Zeppelin and Tableau.
Thanks
The Syncfusion Dashboard Designer supports designing dashboards against big data through a Spark SQL or Hive connection. See:
https://help.syncfusion.com/dashboard-platform/dashboard-designer/connecting-to-data/connecting-to-data#connecting-to-spark-sql-data
https://help.syncfusion.com/dashboard-platform/dashboard-designer/connecting-to-data/connecting-to-data#connecting-to-hive-data
I work in a large enterprise and have a project to build some custom automated dashboards for our IT department. The data volume is small and needs to be fetched only from REST API endpoints. The process needs to be fully automated, and there is not enough time to build a custom API wrapper. For this I was going to use Apache Airflow + Apache Superset. I have been googling for a couple of days for an easier open source alternative to Apache Airflow for moving data from the REST API endpoints so that it can be visualized in Superset. Please share your experience: what would you choose instead of Apache Airflow?
I chose to go with the following solution:
Apache Airflow + PostgreSQL + Grafana (Grafana instead of Superset, because in Grafana you can actually create a drill-down option using a workaround)
I want to use the spark-redshift library to write data from AWS S3 to AWS Redshift using the following code.
Before using it, I would like to know whether the spark-redshift library is open source and free to use, or whether it has to be licensed through Databricks.
// SQL statement that Redshift runs before the data is loaded
val query = "delete from emp where empno=7790"

// Write data to Redshift, staging it in the S3 temp directory first
mydf.coalesce(1).write
  .format("com.databricks.spark.redshift")
  .option("url", redShiftUrl)
  .option("dbtable", "emp")
  .option("tempdir", s3dir)
  .option("forward_spark_s3_credentials", true)
  .option("preactions", query)
  .mode("append")
  .save()
spark-redshift is a package maintained by Databricks, with community contributions from SwiftKey and other companies. It is open source (Apache License 2.0) and free to use; no separate license from Databricks is needed.
I'm a student working on my final-year project, which is about data warehousing, BI, etc.
I have been asked to work with Apache Kylin.
I did some research on it and learned a bit.
I also looked into whether it is possible to use PostgreSQL as the data warehouse and have it communicate with Apache Kylin to build cubes,
but I found nothing.
So would you please answer the following question:
Is it possible to make Apache Kylin communicate with a PostgreSQL DWH?
And if there is any hard-to-find documentation about it, would you please share it?
Time is running out, and I would really appreciate your answers and guidance.
Thanks in advance.
Khalil
It's doable. Kylin provides a data source adapter for JDBC data sources, and PostgreSQL could be one of those adapters. MySQL is supported by default. You can check this link to learn more: http://kylin.apache.org/development/datasource_sdk.html
I am planning to use OrientDB in production through the JDBC driver, so I need to confirm some points:
Can the JDBC driver provide all OrientDB features (transactions, links, etc.), or is the Java API the better choice?
I noticed that there is a Spring Data implementation in the OrientDB GitHub. Is it ready for production use?
There is a discussion of the issue you raised at this link.
In general, the JDBC driver supports only a subset of OrientDB: the part you can use through commands.
If you're a Java developer, I suggest you use the Java Graph API: http://orientdb.com/docs/last/Graph-Database-Tinkerpop.html
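To give a rough idea of what "the part you can use through commands" looks like in practice, here is a minimal Scala sketch that runs a SQL command through the JDBC driver. It assumes the orientdb-jdbc artifact is on the classpath and a server is reachable on localhost. The database name `test`, the `admin`/`admin` credentials, and the `Person` class with a `name` property are placeholders for illustration, not something from your setup.

```scala
import java.sql.{Connection, DriverManager}
import java.util.Properties

object OrientJdbcSketch {
  // Builds a JDBC URL for a remote OrientDB server,
  // e.g. jdbc:orient:remote:localhost/test
  def remoteUrl(host: String, database: String): String =
    s"jdbc:orient:remote:$host/$database"

  def main(args: Array[String]): Unit = {
    // Driver class shipped in the orientdb-jdbc artifact
    Class.forName("com.orientechnologies.orient.jdbc.OrientJdbcDriver")

    val info = new Properties()
    info.put("user", "admin")     // assumption: placeholder credentials
    info.put("password", "admin") // assumption: placeholder credentials

    val conn: Connection =
      DriverManager.getConnection(remoteUrl("localhost", "test"), info)
    try {
      val stmt = conn.createStatement()
      try {
        // Hypothetical class and property; only SQL commands go through JDBC
        val rs = stmt.executeQuery("SELECT name FROM Person")
        while (rs.next()) println(rs.getString("name"))
      } finally stmt.close()
    } finally conn.close()
  }
}
```

Anything beyond SQL commands like this, such as traversing the graph through vertex and edge objects, is where the Java Graph API is the better fit.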
Is there any way to do what the title describes?
I mean, is it possible in theory? Is there any plugin available for JasperReports Server?
Or maybe there are other reporting tools that could do a similar job to JasperReports Server?
I cannot find any info on Google.
Yes, there's a plugin for Cassandra, see: http://jasperforge.org/projects/bigdatareportingfornosqlandhadoop