Notice: This Wiki is now read only and edits are no longer possible. Please see: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/wikis/Wiki-shutdown-plan for the plan.
SMILA/Documentation/JdbcLoggingPipelet
Contents
Bundle: org.eclipse.smila.jdbc
org.eclipse.smila.jdbc.JdbcLoggingPipelet
Description
The JdbcLoggingPipelet logs a given statement string (= PreparedStatement) into a database via JDBC for each processed record. The PreparedStatement typically has parameters '?', these are filled with the values which are referenced by the valuePaths parameter. The valuePaths parameter contains a list of strings, each string contains a path to a (sub-)attribute in the currently processed record's metadata.
Configuration
Property | Type | Read Type | Required | Description |
---|---|---|---|---|
dbUrl | String | runtime | yes | The (JDBC driver) dependent URL which is used to connect to the database. |
dbProps | Map | runtime | yes | Database connection properties, e.g. 'user' and 'password'. |
stmt | String | runtime | yes | The (Prepared)Statement which is logged to the database, may have parameters. |
valuePaths | String (multi) | runtime | no | List of paths which point to the record's metadata (sub)attributes that are used as parameter values in the logged statement. A path is separated by '/'. |
Configuring value paths
The following should be taken into account when specifying a value path:
- If a value path references a single value, this is used for the PreparedStatement.
- If a value path references a sequence of values, (only) the first value of the sequence is used.
- In any other case, the value is set 'null'. (Keep in mind that a 'null' value is valid for a PreparedStatement)
The valuePaths parameter is optional, so it's ok to leave it empty, or to leave it at all. However, this must fit to the given log statement, if no value paths are specified, the statement musn't have parameters.
Processing
The pipelet works asynchronously: That means that it does not execute the SQL requests immediately, but only prepares them and adds them to a in-memory queue which is maintained by the JdbcWriterService in bundle org.eclipse.smila.jdbc. To prevent OutOfMemory errors this queue has a maximum capacity which is configurable in configuration file org.eclipse.smila.jdbc/jdbcwriterservice.properties. The default capacity per database connection is 100:
# capacity of each database queue capacity=100
If the queue is still full when a new request should be added by the pipelet, the request will be discarded and the pipelet will log a warning like this:
2013-09-17 09:03:55,892 (MESZ) WARN [ODEServerImpl-14 ] jdbc.JdbcLoggingPipelet - Error while processing record with id 'record-72-1522951520'. org.eclipse.smila.jdbc.JdbcWriterServiceException: Error queueing data for queue jdbc:oracle:thin:@host:1521:SID at org.eclipse.smila.jdbc.internal.JdbcWriterServiceImpl.write(JdbcWriterServiceImpl.java:83) at org.eclipse.smila.jdbc.JdbcLoggingPipelet.process(JdbcLoggingPipelet.java:54) ... Caused by: java.lang.IllegalStateException: Queue full at java.util.AbstractQueue.add(AbstractQueue.java:98) at java.util.concurrent.ArrayBlockingQueue.add(ArrayBlockingQueue.java:283) at org.eclipse.smila.jdbc.internal.JdbcWriterServiceImpl.write(JdbcWriterServiceImpl.java:81) ... 36 more
Example
The following example shows a sample pipelet configuration and the resulting log statement when logging the given record.
Pipelet configuration:
<proc:configuration> <rec:Val key="dbUrl">jdbc:postgresql://localhost/mydb</rec:Val> <rec:Val key="stmt">INSERT INTO myTable VALUES (?, ?, 100, ?, ?)</rec:Val> <rec:Map key="dbProps"> <rec:Val key="user">Andreas</rec:Val> <rec:Val key="password">top_secret</rec:Val> </rec:Map> <rec:Seq key="valuePaths"> <rec:Val>_recordid</rec:Val> <rec:Val>_parameters/session/id</rec:Val> <rec:Val>Authors</rec:Val> <rec:Val>Size</rec:Val> </rec:Seq> </proc:configuration>
Sample record and resulting logged SQL statement:
{ "_recordid":"web:http://example.org", "_parameters": { "maxCount": 100, "session": { "timestamp": "2012-10-12T14:00:00", "id": 4711 } } "Path": "http://example.org/index.html", "Authors": ["Andreas Weber", "Jürgen Schumacher", "Andreas Schank"] } -> INSERT INTO myTable VALUES ('web:http://example.org', 4711, 100, 'Andreas Weber', null)
org.eclipse.smila.jdbc.JdbcFetcherPipelet
Description
The JdbcFetcherPipelet fetches the results of a given statement string (= PreparedStatement) from a database via JDBC for each processed record. The PreparedStatement typically has parameters '?', these are filled with the values which are referenced by the valuePaths parameter. The valuePaths parameter contains a list of strings, each string contains a path to a (sub-)attribute in the currently processed record's metadata.
Configuration
Property | Type | Read Type | Required | Description |
---|---|---|---|---|
dbUrl | String | runtime | yes | The (JDBC driver) dependent URL which is used to connect to the database. |
dbProps | Map | runtime | yes | Database connection properties, e.g. 'user' and 'password'. |
stmt | String | runtime | yes | The (Prepared)Statement with the SELECT statement to fetch data from the database, may have parameters. |
valuePaths | String (multi) | runtime | no | List of paths which point to the record's metadata (sub)attributes that are used as parameter values in the SELECT statement. A path is separated by '/'. |
Configuring value paths
The following should be taken into account when specifying a value path:
- If a value path references a single value, this is used for the PreparedStatement.
- If a value path references a sequence of values, (only) the first value of the sequence is used.
- In any other case, the value is set 'null'. (Keep in mind that a 'null' value is valid for a PreparedStatement)
The valuePaths parameter is optional, so it's ok to leave it empty, or to leave it at all. However, this must fit to the given SELECT statement, if no value paths are specified, the statement musn't have parameters.
Example
The following example shows a sample pipelet configuration:
Pipelet configuration:
<proc:configuration> <rec:Val key="dbUrl">jdbc:postgresql://localhost/mydb</rec:Val> <rec:Val key="stmt">SELECT COUNT(*) FROM myTable WHERE action=?</rec:Val> <rec:Map key="dbProps"> <rec:Val key="user">Andreas</rec:Val> <rec:Val key="password">top_secret</rec:Val> </rec:Map> <rec:Seq key="valuePaths"> <rec:Val>Action</rec:Val> </rec:Seq> </proc:configuration>
org.eclipse.smila.jdbc.JdbcSelectPipelet
Description
The JdbcSelectPipelet returns the results of a given statement string (= PreparedStatement) from a database via JDBC for each processed record. The PreparedStatement typically has parameters '?', these are filled with the values which are referenced by the valuePaths parameter. The valuePaths parameter contains a list of strings, each string contains a path to a (sub-)attribute in the currently processed record's metadata.
Configuration
Property | Type | Read Type | Required | Description |
---|---|---|---|---|
dbUrl | String | runtime | yes | The (JDBC driver) dependent URL which is used to connect to the database. |
dbProps | Map | runtime | yes | Database connection properties, e.g. 'user' and 'password'. |
stmt | String | runtime | yes | The (Prepared)Statement with the SELECT statement to fetch data from the database, may have parameters. |
valuePaths | String (multi) | runtime | no | List of paths which point to the record's metadata (sub)attributes that are used as parameter values in the SELECT statement. A path is separated by '/'. |
result | String ('single' / 'multi') | runtime | no | whether resulting rows selected from the database should be returned in a new 'records' section of the input record ('single'), or as multiple records - i.e. one record per row ('multi'). |
Configuring value paths
Same way as for JdbcFetcherPipelet (see above).
Examples
The following example shows sample pipelet results:
Single input record:
{ "recordid": "inputId", "att1": "value" }
Result ('single') when selecting two rows:
{ "recordid": "inputId", "att1": "value", "records": [ {"DB-COLUMN-1":"value1", "DB-COLUMN-2": "value2"}, {"DB-COLUMN-1":"value3", "DB-COLUMN-2": "value4"} ] }
Result ('multi') when selecting two rows:
{ "recordid": "inputId#1", "DB-COLUMN-1":"value1", "DB-COLUMN-2": "value2" } { "recordid": "inputId#2", "DB-COLUMN-1":"value3", "DB-COLUMN-2": "value4" }
Multiple input records:
{ "recordid": "inputId1", "att1": "value1", "_parameters": { "stmt": "SELECT * FROM t WHERE DB-COLUMN-1 = 'v1'" } } { "recordid": "inputId2", "att1": "value2", "_parameters": { "stmt": "SELECT * FROM t WHERE DB-COLUMN-1 = 'v2'" } }
Sample Result ('single'):
{ "recordid": "inputId1", "att1": "value1", "_parameters": { "stmt": "SELECT * FROM t WHERE DB-COLUMN-1 = 'v1'" }, "records": [ {"DB-COLUMN-1":"v1", "DB-COLUMN-2": "v2"}, {"DB-COLUMN-1":"v1", "DB-COLUMN-2": "v3"} ] } { "recordid": "inputId2", "att1": "value2", "_parameters": { "stmt": "SELECT * FROM t WHERE DB-COLUMN-1 = 'v2'" }, "records": [ {"DB-COLUMN-1":"v2", "DB-COLUMN-2": "v4"} ] }
Sample Result ('multi'):
{ "recordid": "inputId1#1", "DB-COLUMN-1":"v1", "DB-COLUMN-2": "v2" } { "recordid": "inputId1#2", "DB-COLUMN-1":"v1", "DB-COLUMN-2": "v3" } { "recordid": "inputId2#1", "DB-COLUMN-1":"v2", "DB-COLUMN-2": "v4" }
Adding JDBC Drivers
By default SMILA includes only JDBC drivers for Derby. If you want to access other databases then you have to provide according JDBC drivers. Have a look here to learn how to add JDBC drivers to SMILA.