Recent versions of ors-amarisoft can collect xlog data and send it to any fluentd/fluentbit-compatible data lake using a common file format.
This was achieved by adding three new parameters to the SlapOS Amarisoft software release:
"xlog_fluentbit_forward_host": {
  "title": "Address to Forward Xlog by Fluentbit",
  "description": "Address of Remote Fluentd or Fluentbit Server to Forward Xlog",
  "type": "string"
},
"xlog_fluentbit_forward_port": {
  "title": "Port to Forward Xlog by Fluentbit",
  "description": "Optional Port of Remote Fluentd or Fluentbit Server to Forward Xlog",
  "type": "string"
},
"xlog_fluentbit_forward_shared_key": {
  "title": "Shared Key to Forward Xlog by Fluentbit",
  "description": "Secret Key Shared with Remote Fluentd or Fluentbit Server for Authentication when Forwarding Xlog",
  "type": "string"
}
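As a minimal sketch of how these parameters might be supplied, the Python snippet below builds an instance parameter dictionary and checks it against the three string-typed properties above. The helper name `check_xlog_forward_params`, the host name, and the key value are hypothetical; only the three parameter names come from the schema.

```python
# Parameter names taken from the schema fragment above; all are strings.
XLOG_FORWARD_PARAMS = (
    "xlog_fluentbit_forward_host",
    "xlog_fluentbit_forward_port",
    "xlog_fluentbit_forward_shared_key",
)


def check_xlog_forward_params(params):
    """Return a list of problems found in the xlog forwarding parameters.

    Hypothetical helper: it only mirrors the "type": "string" constraint
    of the schema and is not part of the software release.
    """
    problems = []
    for name in XLOG_FORWARD_PARAMS:
        if name in params and not isinstance(params[name], str):
            kind = type(params[name]).__name__
            problems.append(f"{name} must be a string, got {kind}")
    return problems


# Example values are placeholders, not real endpoints or secrets.
instance_parameters = {
    "xlog_fluentbit_forward_host": "fluentd.example.com",
    "xlog_fluentbit_forward_port": "24224",  # a string, as the schema requires
    "xlog_fluentbit_forward_shared_key": "change-me",
}

print(check_xlog_forward_params(instance_parameters))
```

Note that the port is declared as a string in the schema, so a bare integer such as `24224` would be reported by this check.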
Fluentbit configuration in the SlapOS Amarisoft software release:
[SERVICE]
    flush           5
[INPUT]
    name            tail
    # path to enb.xlog, set automatically
    path            ${:logfile}
    Read_from_Head  True
[OUTPUT]
    name            forward
    match           *
    # value of xlog_fluentbit_forward_host
    Host            ${:forward-host}
{%- if slapparameter_dict.get('xlog_fluentbit_forward_port') %}
    # value of xlog_fluentbit_forward_port
    Port            ${:forward-port}
{%- endif %}
    # value of xlog_fluentbit_forward_shared_key
    Shared_Key      ${:forward-shared-key}
    # self hostname, set automatically
    Self_Hostname   ${:forward-self-hostname}
    tls             on
    tls.verify      off
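The conditional in the template means that the `Port` line is only written when `xlog_fluentbit_forward_port` is set; otherwise Fluent Bit falls back to the default port of its forward output (24224). The sketch below reproduces that logic in plain Python rather than with the actual template engine. The function `render_forward_output` and the sample values are hypothetical.

```python
def render_forward_output(host, shared_key, self_hostname, port=None):
    """Build the [OUTPUT] section, adding Port only when one is given.

    Hypothetical stand-in for the template above; it does not use the
    real templating machinery of the software release.
    """
    entries = [("name", "forward"), ("match", "*"), ("Host", host)]
    if port:  # mirrors the template guard: empty or missing means no Port line
        entries.append(("Port", port))
    entries += [
        ("Shared_Key", shared_key),
        ("Self_Hostname", self_hostname),
        ("tls", "on"),
        ("tls.verify", "off"),
    ]
    lines = ["[OUTPUT]"]
    lines += [f"    {key:<15} {value}" for key, value in entries]
    return "\n".join(lines)


# Placeholder values for illustration only.
with_port = render_forward_output(
    "fluentd.example.com", "change-me", "bbu-1", port="24224"
)
without_port = render_forward_output("fluentd.example.com", "change-me", "bbu-1")
print(with_port)
print(without_port)
```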
Selected operation data is collected by the xamari process, which runs alongside the Amarisoft enb process on the BBU. The data is then converted into a common format (enb.xlog) and sent by fluentbit to a remote server for further processing, such as KPI calculation.
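To illustrate the kind of processing the remote server might perform, the sketch below counts entries per message type in an xlog stream. It assumes that each line of enb.xlog is a standalone JSON object and that entries carry `message` and `utc` fields; both the field names and the sample lines are assumptions made for this example, not a specification of the format.

```python
import json
from collections import Counter


def count_messages(lines):
    """Count xlog entries per message type, skipping blank or malformed lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # for example, a line that was only partially written
        if isinstance(entry, dict) and "message" in entry:
            counts[entry["message"]] += 1
    return counts


# Made-up sample lines; real enb.xlog content may look different.
sample = [
    '{"message": "stats", "utc": 1700000000.0}',
    '{"message": "ue_get", "utc": 1700000000.5}',
    '{"message": "stats", "utc": 1700000005.0}',
    '{"message": "stats", "utc": 17000',
]
print(count_messages(sample))
```

The last sample line is deliberately truncated to show that incomplete lines are skipped rather than aborting the whole run.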
A complete example is explained in a notebook. It is a first example of standardisation of operation data of a vRAN infrastructure. A proof-of-concept implementation of a data lake capable of generating 3GPP KPIs from xlog files has been deployed with Wendelin big data hub.
Further standardisation may be needed for environmental data (e.g. temperature). For resource usage (e.g. CPU), SlapOS already proposes a standard file format that records RAM, CPU, and disk usage per process. However, these file formats are mutually inconsistent, which makes sharing data in a common data space difficult or impossible.
Ideally, we need a common way to share time series, whatever their properties. Possible candidates include:
- HDF5 (exhaustive feature set, limited adoption)
- Parquet (reduced feature set, well adopted)
- ndarray (minimal feature set, but widely adopted, with native support in Python, JS, C, C++, FORTRAN, etc.)
- JSONL (a good replacement for CSV, with schemas and native support in Python, JS, etc.)
- a yet-to-be-defined standard file format from IDSA as part of DIN SPEC 27070
- etc.
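As an illustration of the JSONL option, the sketch below writes samples from two unrelated sources (a temperature probe and per-process CPU usage) into one stream and reads them back grouped by series. The record layout with `series`, `utc`, and `value` fields is a hypothetical convention chosen for this example, not an existing standard.

```python
import io
import json
from collections import defaultdict


def write_samples(stream, samples):
    """Write one JSON object per line (JSONL)."""
    for sample in samples:
        stream.write(json.dumps(sample, sort_keys=True) + "\n")


def read_series(stream):
    """Group (utc, value) pairs by series name."""
    series = defaultdict(list)
    for line in stream:
        if line.strip():
            record = json.loads(line)
            series[record["series"]].append((record["utc"], record["value"]))
    return dict(series)


# Made-up measurements for illustration.
samples = [
    {"series": "bbu.temperature_celsius", "utc": 1700000000, "value": 41.5},
    {"series": "enb.cpu_percent", "utc": 1700000000, "value": 12.0},
    {"series": "bbu.temperature_celsius", "utc": 1700000060, "value": 42.0},
]

buffer = io.StringIO()
write_samples(buffer, samples)
buffer.seek(0)
grouped = read_series(buffer)
print(grouped["bbu.temperature_celsius"])
```

Because every record is self-describing, series with different units or sampling rates can share one file, at the cost of repeating field names on every line.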
Selecting a common standard will be part of the next development steps of SlapOS.
References