Add a Full Script Support example using Pandas
josefinestal committed Dec 28, 2017
1 parent 8b8f52c commit f27f0e3
Showing 9 changed files with 475 additions and 8 deletions.
2 changes: 2 additions & 0 deletions examples/python/FullScriptSupport_Pandas/.gitignore
@@ -0,0 +1,2 @@
logs/
__pycache__/
@@ -0,0 +1,122 @@
#! /usr/bin/env python3
import argparse
import logging
import logging.config
import os
import sys
import time
from concurrent import futures

# Add Generated folder to module path.
PARENT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(os.path.join(PARENT_DIR, 'Generated'))

import ServerSideExtension_pb2 as SSE
import grpc
from ScriptEval_scriptPandas import ScriptEval

_ONE_DAY_IN_SECONDS = 60 * 60 * 24


class ExtensionService(SSE.ConnectorServicer):
    """
    SSE-plugin with support for full script functionality.
    """

    def __init__(self):
        """
        Class initializer.
        """
        self.ScriptEval = ScriptEval()
        os.makedirs('logs', exist_ok=True)
        log_file = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'logger.config')
        logging.config.fileConfig(log_file)
        logging.info('Logging enabled')

    """
    Implementation of rpc functions.
    """

    def GetCapabilities(self, request, context):
        """
        Get capabilities.
        Note that neither request nor context is used in the implementation of this method, but both are still
        added as parameters. The reason is that gRPC always sends both when making a function call and therefore
        we must include them to avoid error messages regarding too many parameters provided from the client.
        :param request: the request, not used in this method.
        :param context: the context, not used in this method.
        :return: the capabilities.
        """
        logging.info('GetCapabilities')
        # Create an instance of the Capabilities grpc message
        # Enable (or disable) script evaluation
        # Set values for pluginIdentifier and pluginVersion
        capabilities = SSE.Capabilities(allowScript=True,
                                        pluginIdentifier='Full Script Support using Pandas- Qlik',
                                        pluginVersion='v1.0.0')

        return capabilities

    def EvaluateScript(self, request, context):
        """
        This plugin supports full script functionality, that is, all function types and all data types.
        :param request: the request sent from Qlik.
        :param context: the gRPC context, used here to read the script request header.
        :return: the result of the script evaluation.
        """
        # Parse header for script request
        metadata = dict(context.invocation_metadata())
        header = SSE.ScriptRequestHeader()
        header.ParseFromString(metadata['qlik-scriptrequestheader-bin'])

        return self.ScriptEval.EvaluateScript(header, request, context)

    """
    Implementation of the Server connecting to gRPC.
    """

    def Serve(self, port, pem_dir):
        """
        Sets up the gRPC server on the given port, using a secure connection if pem_dir is provided
        and an insecure connection otherwise.
        :param port: port to listen on.
        :param pem_dir: Directory including certificates
        :return: None
        """
        # Create gRPC server
        server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
        SSE.add_ConnectorServicer_to_server(self, server)

        if pem_dir:
            # Secure connection
            with open(os.path.join(pem_dir, 'sse_server_key.pem'), 'rb') as f:
                private_key = f.read()
            with open(os.path.join(pem_dir, 'sse_server_cert.pem'), 'rb') as f:
                cert_chain = f.read()
            with open(os.path.join(pem_dir, 'root_cert.pem'), 'rb') as f:
                root_cert = f.read()
            credentials = grpc.ssl_server_credentials([(private_key, cert_chain)], root_cert, True)
            server.add_secure_port('[::]:{}'.format(port), credentials)
            logging.info('*** Running server in secure mode on port: {} ***'.format(port))
        else:
            # Insecure connection
            server.add_insecure_port('[::]:{}'.format(port))
            logging.info('*** Running server in insecure mode on port: {} ***'.format(port))

        # Start gRPC server
        server.start()
        try:
            while True:
                time.sleep(_ONE_DAY_IN_SECONDS)
        except KeyboardInterrupt:
            server.stop(0)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--port', nargs='?', default='50056')
    parser.add_argument('--pem_dir', nargs='?')
    args = parser.parse_args()

    calc = ExtensionService()
    calc.Serve(args.port, args.pem_dir)
45 changes: 45 additions & 0 deletions examples/python/FullScriptSupport_Pandas/README.md
@@ -0,0 +1,45 @@
# Example: Full script support using Pandas
This example plugin supports all script functionality and is based on the original [Full Script Support](../FullScriptSupport/README.md) Python example. The main difference is the use of the Pandas library: the data received from Qlik is now stored in a Pandas data frame. In addition, the script is evaluated with Python's built-in `exec` function rather than `eval`, which the original example plugin used. This change makes it possible to pass a multiline script from Qlik.
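
The difference between the two built-ins can be seen in a small standalone sketch that does not depend on the plugin code. The script text and variable names below are made up for illustration only.

```python
# Illustrative only: this script text is a made-up example, not one taken
# from the plugin or from the Qlik document.
multiline_script = "total = 0\nfor value in [1, 2, 3]:\n    total += value\n"

# eval() accepts a single expression, so statements raise SyntaxError.
try:
    eval(multiline_script)
    eval_accepted = True
except SyntaxError:
    eval_accepted = False

# exec() runs arbitrary statements; names it binds end up in the namespace dict.
namespace = {}
exec(multiline_script, namespace)

print(eval_accepted)       # False
print(namespace["total"])  # 6
```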

## Content
* [Implementation](#implementation)
    * [Parameters sent from Qlik](#parameters-sent-from-qlik)
    * [TableDescription](#tabledescription)
    * [Result](#result)
* [Qlik documents](#qlik-documents)
* [Run the example!](#run-the-example)

## Implementation
We have tried to provide well-documented code that is easy to follow. If something is unclear, please let us know so that we can update and improve the documentation. In this file, we walk through a few key points in the implementation that are worth clarifying.

### Parameters sent from Qlik
The parameters sent from Qlik are now stored in a `pandas.DataFrame` object called `q`. The names of the parameters, and hence the column names of `q`, are set to the names sent from Qlik in the _ScriptRequestHeader_. For instance, if you send a parameter called `Foo` from Qlik, you can access it in the script as `q.Foo` or `q["Foo"]`.

If the parameter is of type _Dual_, the plugin creates two additional columns in the `q` data frame that hold the string and numerical representations. Their names are the parameter name followed by `_str` and `_num` respectively. For example, a parameter called `Bar` with datatype _Dual_ results in three columns in `q`: `Bar`, `Bar_str` and `Bar_num`. `Bar` contains both strings and numerics, `Bar_str` contains only strings and `Bar_num` only numerics.
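
The column naming can be sketched without the plugin or Pandas. In the snippet below, `dual_to_columns` is a hypothetical helper, each Dual is modelled as a plain `(string, number)` pair rather than the protobuf message the plugin receives, and the rule used to fill the mixed `Bar` column is an assumption, not the plugin's documented behaviour.

```python
# Hypothetical input: each Dual is modelled as a (string, number) pair.
# The real plugin receives protobuf Dual messages instead.
def dual_to_columns(name, duals):
    """Return the three columns for a Dual parameter as a dict of lists."""
    return {
        # Assumption for this sketch: keep the string part when it is
        # non-empty, otherwise fall back to the numeric part.
        name: [s if s else n for s, n in duals],
        name + "_str": [s for s, _ in duals],
        name + "_num": [n for _, n in duals],
    }


columns = dual_to_columns("Bar", [("Jan", 1.0), ("", 2.0)])
print(list(columns))       # ['Bar', 'Bar_str', 'Bar_num']
print(columns["Bar"])      # ['Jan', 2.0]
print(columns["Bar_num"])  # [1.0, 2.0]
```

Passing a dict like `columns` to `pandas.DataFrame(columns)` would give a data frame with the same three column names.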

### TableDescription
In the load script, when using the `Load ... Extension ...` syntax, you can create the `TableDescription` message within the script. This can be useful if, for example, you want to name, set tags for, or change the datatype of the fields you are sending back to Qlik. Read more about what metadata can be included in the `TableDescription` in [SSE_Protocol.md](../../../docs/SSE_Protocol.md#qlik.sse.TableDescription).

An instance of the `TableDescription` message is available in the script under the name `table`. You can add metadata to that instance according to the protocol. A few simple examples:

- `table.name = "Table1"` sets the table name to be _Table1_
- `table.fields.add(name="firstField", dataType=1, tags=["tag1", "tag2"])` adds a _numeric_ field called _firstField_ with the tags _tag1_ and _tag2_.

Note that if a `TableDescription` is sent, the number of fields in the message must match the number of fields of data sent back to Qlik.
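
The field-count rule can be illustrated with a small check. This is only a sketch: `TableStub` and `FieldStub` are hypothetical stand-ins for the generated protobuf messages (the real `table.fields` is populated with `table.fields.add(...)`, not `append`), and `field_count_matches` is not part of the plugin.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldStub:
    """Hypothetical stand-in for a field entry in TableDescription."""
    name: str = ""
    dataType: int = 0
    tags: List[str] = field(default_factory=list)


@dataclass
class TableStub:
    """Hypothetical stand-in for the TableDescription message."""
    name: str = ""
    fields: List[FieldStub] = field(default_factory=list)


def field_count_matches(table, rows):
    """Return True if every row has as many values as the table has fields."""
    return all(len(row) == len(table.fields) for row in rows)


table = TableStub(name="Table1")
table.fields.append(FieldStub(name="firstField", dataType=1, tags=["tag1", "tag2"]))

print(field_count_matches(table, [[1.0], [2.0]]))  # True
print(field_count_matches(table, [[1.0, "a"]]))    # False
```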

### Result
Because the script is now evaluated with `exec`, there are some changes to what can be written in the script. See the Python documentation of `exec` [here](https://docs.python.org/3/library/functions.html#exec). One change is that `exec` does not return anything. The result must therefore be assigned to a specific variable, which we have chosen to call `qResult`. If nothing is assigned to that variable, no data is returned to Qlik. Note that `qResult` is not required to be a Pandas data frame.

For example, if you want to return the same parameters as received from Qlik, you can use the script `'qResult = q.values'`. Note that if you write `'qResult = q'` instead, the entire data frame, including the column names as the first row, is passed along to where the duals and BundledRows are created. This could result in an error if the column names are strings and you are supposed to return numerics.
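
The `qResult` convention can be sketched in plain Python as follows. This is a simplified illustration, not the plugin's implementation: `run_script` is a hypothetical helper, and `q` is a plain list of rows here rather than a Pandas data frame.

```python
def run_script(script, q):
    """Run a script with exec() and return whatever it bound to qResult."""
    namespace = {"q": q}
    exec(script, namespace)
    # .get() returns None when the script never assigned qResult.
    return namespace.get("qResult")


q = [[1, 2], [3, 4]]  # stand-in for the data received from Qlik

doubled = run_script("tmp = [[2 * v for v in row] for row in q]\nqResult = tmp", q)
nothing = run_script("tmp = len(q)", q)

print(doubled)  # [[2, 4], [6, 8]]
print(nothing)  # None
```

The second call shows the case described above: the script runs without error, but since it never assigns `qResult`, there is nothing to send back.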


## Qlik documents
We provide an example Qlik Sense document (SSE_Full_Script_Support_pandas.qvf). It's the same as the original Full Script Support example, but with modified scripts to work with the Pandas implementation and the use of `exec`.

In the load script there is an example of the `Load ... Extension ...` syntax for a table load using SSE. There are also examples of using SSE expressions within a regular load. In that case the SSE call is treated as a scalar or aggregation and only one column can be returned.

There are a number of examples in the sheets of how to retrieve the data from the script, and how to make simple calculations.


## Run the example!
To run this example, follow the instructions in [Getting started with the Python examples](../GetStarted.md).
33 changes: 33 additions & 0 deletions examples/python/FullScriptSupport_Pandas/SSEData_scriptPandas.py
@@ -0,0 +1,33 @@
from enum import Enum


class ArgType(Enum):
    """
    Represents data types that can be used
    as arguments in different script functions.
    """
    Undefined = -1
    Empty = 0
    String = 1
    Numeric = 2
    Mixed = 3


class ReturnType(Enum):
    """
    Represents return types that can
    be used in script evaluation.
    """
    Undefined = -1
    String = 0
    Numeric = 1
    Dual = 2


class FunctionType(Enum):
    """
    Represents function types.
    """
    Scalar = 0
    Aggregation = 1
    Tensor = 2
Binary file not shown.