Details
Type: Task
Status: Closed
Priority: Minor
Resolution: Won't Fix
Affects Version/s: 1.5.7
Fix Version/s: None
Description
Install Spark 2.0 using Docker for simplicity, following this image:
https://hub.docker.com/r/singularities/spark/
Copy the sample docker-compose.yml locally and run:
docker-compose up
After a while, the master and worker both start and logging stops. Now:
docker exec -it sparkdocker_master_1 bash
curl -O https://downloads.mariadb.com/Connectors/java/connector-java-1.5.7/mariadb-java-client-1.5.7.jar
pyspark --driver-class-path mariadb-java-client-1.5.7.jar --jars mariadb-java-client-1.5.7.jar

Then, in the PySpark shell:

from pyspark.sql import DataFrameReader

url = 'jdbc:mariadb://172.21.21.2:3306/test?useServerPrepStmts=false'
properties = {'user': 'root', 'driver': 'org.mariadb.jdbc.Driver', 'useServerPrepStmts': 'false'}
df = DataFrameReader(sqlContext).jdbc(url=url, table='tmp1', properties=properties)
df.show()
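For completeness, the same read expressed through Spark 2.0's SparkSession entry point goes through the same DataFrameReader.jdbc call, so it can be expected to fail the same way. A minimal self-contained sketch (the app name is arbitrary):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('mariadb-jdbc-repro').getOrCreate()

url = 'jdbc:mariadb://172.21.21.2:3306/test?useServerPrepStmts=false'
properties = {'user': 'root', 'driver': 'org.mariadb.jdbc.Driver',
              'useServerPrepStmts': 'false'}

# spark.read is a DataFrameReader, so this is equivalent to the
# DataFrameReader(sqlContext).jdbc(...) call above.
df = spark.read.jdbc(url=url, table='tmp1', properties=properties)
df.show()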
In either case, df.show() results in a stack trace with this error:
Caused by: java.sql.SQLException: Out of range value for column 'i' : value i is not in Integer range
    at org.mariadb.jdbc.internal.queryresults.resultset.MariaSelectResultSet.parseInt(MariaSelectResultSet.java:3233)
    at org.mariadb.jdbc.internal.queryresults.resultset.MariaSelectResultSet.getInt(MariaSelectResultSet.java:992)
    at org.mariadb.jdbc.internal.queryresults.resultset.MariaSelectResultSet.getInt(MariaSelectResultSet.java:969)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.getNext(JDBCRDD.scala:446)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.hasNext(JDBCRDD.scala:544)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:246)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:240)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
If I instead use the MySQL connector (mysql-connector-java-5.1.40.tar.gz), it works (see the sketch below).
This Docker image uses Java 8; however, I modified it to use Java 7, rebuilt, and the error still happens, so this is not a Java 8 specific issue.
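For reference, a minimal sketch of the working MySQL-connector variant. The jdbc:mysql URL scheme, the com.mysql.jdbc.Driver class name, and the mysql-connector-java-5.1.40-bin.jar file name are assumptions based on standard Connector/J 5.1.x packaging; the original report only names the tarball:

# Launch pyspark with the MySQL driver on the classpath instead, e.g.:
#   pyspark --driver-class-path mysql-connector-java-5.1.40-bin.jar \
#           --jars mysql-connector-java-5.1.40-bin.jar
from pyspark.sql import DataFrameReader

# Same host and table as above, but with the jdbc:mysql scheme and the
# MySQL Connector/J driver class (assumed names, see note above).
url = 'jdbc:mysql://172.21.21.2:3306/test'
properties = {'user': 'root', 'driver': 'com.mysql.jdbc.Driver'}

df = DataFrameReader(sqlContext).jdbc(url=url, table='tmp1', properties=properties)
df.show()  # succeeds with the MySQL driver, per the report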
The table definition is very simple; note that the inserted values are well within Integer range, yet the error message reports the literal string "i" as the out-of-range value:
create table tmp1 (i int, ip int);
insert into tmp1 values (1,1);
Issue Links
includes: CONJ-423 Permit to have MySQL driver and MariaDB driver in same classpath (Closed)