Radoop Release Notes - Altair RapidMiner Documentation


Enhancements and bug fixes

The following improvements are part of RapidMiner Radoop 7.6.

Added support for both Standard and Premium Azure HDInsight 3.5. It is recommended to use the Import from Cluster Manager option to create the connection directly from Ambari
Container reuse is now supported on Hive-on-Tez in addition to Hive-on-Spark
HiveServer2 High Availability (using ZooKeeper's service discovery) is now supported
Dynamic Container Pool size now adapts to changing cluster size
Hadoop client libraries are upgraded to 2.8.1
SparkRM and Single Process Pushdown now also log RapidMiner initialization, so issues with e.g. extensions can be investigated
After importing a connection with Import from Cluster Manager, the JDBC URL Postfix is now populated with the necessary value if HiveServer2 transport mode is set to http
Spark job test now reports if the remote Spark Assembly is incompatible with the chosen Apache Spark version (relevant for CDH 5.11, 5.12 and potentially other versions)
Increased default timeout value for DataNode networking test from 30 to 60
The sensitive property list for Extract Logs can now be customized (in addition to the built-in anonymization)
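
As an illustration of the HiveServer2 High Availability and http transport mode items above, the following minimal sketch shows how the corresponding standard Hive JDBC URL parameters are typically assembled. The helper `hive_jdbc_url` and all host names are hypothetical, and the exact JDBC URL Postfix value that Radoop populates is not stated in these notes; only the parameter names (`serviceDiscoveryMode`, `zooKeeperNamespace`, `transportMode`, `httpPath`) come from the Apache Hive JDBC driver, shown here with commonly used values.

```python
def hive_jdbc_url(hosts, database="default", params=None):
    """Build a HiveServer2 JDBC URL from host:port pairs and session parameters.

    Hypothetical helper for illustration only; it is not part of Radoop.
    """
    url = "jdbc:hive2://" + ",".join(hosts) + "/" + database
    # Session parameters are appended as ';key=value' pairs after the database name.
    for key, value in (params or {}).items():
        url += f";{key}={value}"
    return url


# ZooKeeper service discovery: the URL lists the ZooKeeper quorum
# (example host names) instead of a single HiveServer2 host.
zk_url = hive_jdbc_url(
    ["zk1.example.com:2181", "zk2.example.com:2181", "zk3.example.com:2181"],
    params={
        "serviceDiscoveryMode": "zooKeeper",
        "zooKeeperNamespace": "hiveserver2",  # common default; cluster-specific
    },
)

# HTTP transport mode: the part after the database name is the kind of
# value a URL postfix would need to carry (assumed, cluster-specific values).
http_url = hive_jdbc_url(
    ["hs2.example.com:10001"],
    params={"transportMode": "http", "httpPath": "cliservice"},
)

print(zk_url)
print(http_url)
```

The namespace, port and `httpPath` values differ between clusters, so they should be taken from the cluster's Hive configuration rather than from this sketch.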

BUGFIX: SparkRM and Single Process Pushdown no longer fail when the input data set in PARQUET format contains complex data types (array, struct, map, nested), or in TEXTFILE format contains array, struct or map data types
BUGFIX: SparkRM no longer throws a StackOverflow error if bootstrapping is used and the number of bootstraps is larger than 250
BUGFIX: Advanced Hive Parameters are no longer applied multiple times, which improves performance
BUGFIX: Automatic temporary data cleaning service is no longer started multiple times concurrently
BUGFIX: During Studio or Server shutdown, temporary data cleaning threads are no longer cancelled prematurely
BUGFIX: The SparkConf description in Spark Script templates is fixed for both Python and R
BUGFIX: With Hive on Spark container reuse enabled, the container pool size can no longer decrease to 0 because of the resource settings; it is always at least 1
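
The lower bound described in the last fix can be pictured with a small sketch. These notes do not describe how Radoop actually derives the pool size, so the function `container_pool_size`, its parameters and the memory-based formula are all assumptions made purely for illustration; only the "never below 1" clamp reflects the behaviour stated above.

```python
def container_pool_size(cluster_memory_mb, container_memory_mb, max_pool_size):
    """Return a pool size that adapts to cluster resources but never drops below 1.

    Hypothetical formula: how many containers of the requested size fit
    into the currently available cluster memory, capped at max_pool_size.
    """
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    fitting = cluster_memory_mb // container_memory_mb
    # Clamp to the range [1, max_pool_size]; the lower bound of 1 mirrors the fix.
    return max(1, min(fitting, max_pool_size))


# Resource settings that would yield 0 containers still produce a pool of 1.
print(container_pool_size(2048, 4096, 8))
# A larger cluster yields a larger pool, up to the configured maximum.
print(container_pool_size(16384, 4096, 8))
print(container_pool_size(65536, 4096, 8))
```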