Operators
Complete reference for all Texera operators organized by category
Operator Categories
1 - Data Input
Operators in the Data Input category
Operators
Total: 8 operators
1.1 - Arrow File Scan
Scan data from an Arrow file
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| File | ✓ | String | - | |
| Limit | | Integer | - | Max output count |
| Offset | | Integer | - | Starting point of output |
Output Ports
1.2 - CSV File Scan
Scan data from a CSV file
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| File | ✓ | String | - | |
| File Encoding | ✓ | UTF_8, UTF_16, US_ASCII | UTF_8 | Decoding charset to use on input |
| Limit | | Integer | - | Max output count |
| Offset | | Integer | - | Starting point of output |
| Delimiter | | String | , | Delimiter to separate each line into fields |
| Header | | Boolean | true | Whether the CSV file contains a header line |
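
How Delimiter, Header, Offset, and Limit combine can be sketched in plain Python. This is an illustration only, not Texera's implementation: the helper `scan_csv` and the generated column names are hypothetical, and the sketch assumes Offset skips that many data rows before Limit caps the output.

```python
import csv
import io
from itertools import islice


def scan_csv(text, delimiter=",", header=True, limit=None, offset=0):
    """Hypothetical helper mirroring the properties in the table above."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if header:
        # The first line supplies the column names.
        names, rows = rows[0], rows[1:]
    else:
        # Assumption: columns get generated names when there is no header line.
        names = [f"column-{i + 1}" for i in range(len(rows[0]))]
    # Assumption: Offset skips rows first, then Limit caps the row count.
    stop = None if limit is None else offset + limit
    return [dict(zip(names, row)) for row in islice(rows, offset, stop)]


sample = "id;name\n1;ann\n2;bob\n3;cy\n"
result = scan_csv(sample, delimiter=";", limit=1, offset=1)
print(result)
```

With a semicolon delimiter, an offset of 1, and a limit of 1, only the second data row is returned; all values stay strings because the sketch does no type inference.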
Output Ports
1.3 - CSVOld File Scan
Scan data from a CSVOld file
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| File | ✓ | String | - | |
| File Encoding | ✓ | UTF_8, UTF_16, US_ASCII | UTF_8 | Decoding charset to use on input |
| Limit | | Integer | - | Max output count |
| Offset | | Integer | - | Starting point of output |
| Delimiter | | String | , | Delimiter to separate each line into fields |
| Header | | Boolean | true | Whether the CSV file contains a header line |
Output Ports
1.4 - File Lister
Select a dataset version and output one filename tuple per file
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Dataset | ✓ | String | - | |
Output Ports
1.5 - File Scan
Scan data from a file
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| File | ✓ | String | - | |
| Encoding | ✓ | UTF_8, UTF_16, US_ASCII | UTF_8 | |
| Extract | | Boolean | false | |
| ↳ Include Filename | | Boolean | false | |
| Attribute Type | ✓ | string, single string, integer, long, double, boolean, timestamp, binary, large binary | string | |
| Attribute Name | ✓ | String | line | |
| Limit | | Integer | - | |
| Offset | | Integer | - | |
Output Ports
1.6 - File Scan From Input
Scan data from file paths provided by input tuples
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Encoding | ✓ | UTF_8, UTF_16, US_ASCII | UTF_8 | |
| Extract | | Boolean | false | |
| Include Filename | | Boolean | false | |
| Attribute Type | ✓ | string, single string, integer, long, double, boolean, timestamp, binary, large binary | string | |
| Attribute Name | ✓ | String | line | |
| Limit | | Integer | - | |
| Offset | | Integer | - | |
Output Ports
1.7 - JSONL File Scan
Scan data from a JSONL file
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| File | ✓ | String | - | |
| File Encoding | ✓ | UTF_8, UTF_16, US_ASCII | UTF_8 | Decoding charset to use on input |
| Limit | | Integer | - | Max output count |
| Offset | | Integer | - | Starting point of output |
| Flatten | ✓ | Boolean | false | Flatten nested objects and arrays |
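
What flattening nested objects and arrays means for one JSONL line can be sketched as follows. The function `flatten` is hypothetical, and the naming scheme (keys joined with `.`, array items keyed by index) is an assumption; the column names the operator actually produces may differ.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a single-level dict.

    Assumption: nested keys are joined with "." and list items are
    keyed by their index.
    """
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


line = '{"user": {"name": "ann", "tags": ["a", "b"]}, "id": 7}'
flat_record = flatten(json.loads(line))
print(flat_record)
```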
Output Ports
1.8 - Text Input
Source data from manually entered text
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Text | ✓ | String | - | |
| Attribute Type | ✓ | string, single string, integer, long, double, boolean, timestamp, binary, large binary | string | |
| Attribute Name | ✓ | String | line | |
| Limit | | Integer | - | |
| Offset | | Integer | - | |
Output Ports
2 - Database Connector
Operators in the Database Connector category
Operators
Total: 3 operators
2.1 - AsterixDB Source
Read data from an AsterixDB instance
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Host | ✓ | String | - | |
| Port | ✓ | String | default | A port number or ‘default’ |
| Database | ✓ | String | - | |
| Table Name | ✓ | String | - | |
| Limit | | Long | - | Max output count |
| Offset | | Long | - | Starting point of output |
| Keyword Search? | | Boolean | false | |
| ↳ Keyword Search Column | | String | - | |
| ↳ Keywords to Search | | String | - | "['hello', 'world'], {'mode':'any'}" OR "['hello', 'world'], {'mode':'all'}" |
| Progressive? | | Boolean | false | |
| ↳ Batch by Column | | String | - | |
| ↳ Min | | String | auto | |
| ↳ Max | | String | auto | |
| ↳ Batch by Interval | | Long | 1000000000 | |
| Geo Search? | | Boolean | false | |
| ↳ Geo Search By Columns | | List | - | Column(s) to check if any of them is in the bounding box below |
| ↳ Geo Search Bounding Box | | List | - | At least 2 entries are required to form a bounding box. Format of each entry: long, lat |
| Regex Search? | | Boolean | false | |
| ↳ Regex Search By Column | | String | - | |
| ↳ Regex to Search | | String | - | |
| Filter Condition? | | Boolean | false | |
| ↳ Predicates | | List | - | Multiple predicates in OR |
| ↳ Attribute | ✓ | String | - | |
| ↳ Condition | ✓ | =, >, >=, <, <=, !=, is null, is not null | - | |
| ↳ Value | | String | - | |
Output Ports
2.2 - MySQL Source
Read data from a MySQL instance
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Host | ✓ | String | - | |
| Port | ✓ | String | default | A port number or ‘default’ |
| Database | ✓ | String | - | |
| Table Name | ✓ | String | - | |
| Username | ✓ | String | - | |
| Password | ✓ | String | - | |
| Limit | | Long | - | Max output count |
| Offset | | Long | - | Starting point of output |
| Keyword Search? | | Boolean | false | |
| ↳ Keyword Search Column | | String | - | |
| ↳ Keywords to Search | | String | - | |
| Progressive? | | Boolean | false | |
| ↳ Batch by Column | | String | - | |
| ↳ Min | | String | auto | |
| ↳ Max | | String | auto | |
| ↳ Batch by Interval | | Long | 1000000000 | |
Output Ports
2.3 - PostgreSQL Source
Read data from a PostgreSQL instance
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Host | ✓ | String | - | |
| Port | ✓ | String | default | A port number or ‘default’ |
| Database | ✓ | String | - | |
| Table Name | ✓ | String | - | |
| Username | ✓ | String | - | |
| Password | ✓ | String | - | |
| Limit | | Long | - | Max output count |
| Offset | | Long | - | Starting point of output |
| Keyword Search? | | Boolean | false | |
| ↳ Keyword Search Column | | String | - | |
| ↳ Keywords to Search | | String | - | E.g. ‘sore & throat’ for AND; ‘sore’, ‘throat’ for OR. See the official PostgreSQL documentation for details |
| Progressive? | | Boolean | false | |
| ↳ Batch by Column | | String | - | |
| ↳ Min | | String | auto | |
| ↳ Max | | String | auto | |
| ↳ Batch by Interval | | Long | 1000000000 | |
Output Ports
3 - Search
Operators in the Search category
Operators
Total: 4 operators
3.1 - Dictionary matcher
Matches tuples if they appear in a given dictionary
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Dictionary | ✓ | String | - | Dictionary values separated by a comma |
| Attribute | ✓ | String | - | Column name to match |
| Result Attribute | ✓ | String | matched | Column name of the matching result |
| Matching Type | ✓ | Scan, Substring, Conjunction | - | |
Output Ports
3.2 - Keyword Search
Search for keyword(s) in a string column
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| attribute | ✓ | String | - | Column to search keyword on |
| keywords | ✓ | String | - | Keywords |
Output Ports
3.3 - Regular Expression
Search for a regular expression in a string column
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Case Insensitive | | Boolean | false | Whether the regex match ignores case (case sensitive by default) |
| Attribute | ✓ | String | - | Column to search regex on |
| Regex | ✓ | String | - | Regular expression |
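
The effect of the Case Insensitive flag can be illustrated with Python's `re` module. This is a sketch, not the operator's code: `regex_filter` is a hypothetical helper, and it assumes a row is kept when the pattern matches anywhere in the value rather than the whole value.

```python
import re


def regex_filter(rows, attribute, regex, case_insensitive=False):
    """Keep rows whose `attribute` value contains a match for `regex`."""
    flags = re.IGNORECASE if case_insensitive else 0
    pattern = re.compile(regex, flags)
    # Assumption: partial matches count, so search() is used, not fullmatch().
    return [row for row in rows if pattern.search(str(row[attribute]))]


logs = [{"text": "Error: disk full"}, {"text": "all good"}, {"text": "ERROR 42"}]
strict = regex_filter(logs, "text", r"error\s+\d+")
relaxed = regex_filter(logs, "text", r"error\s+\d+", case_insensitive=True)
print(strict, relaxed)
```

With the default case-sensitive setting no row matches the lowercase pattern; enabling the flag keeps the `ERROR 42` row.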
Output Ports
3.4 - Substring Search
Search for substring(s) in a string column
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| attribute | ✓ | String | - | Column to search substring on |
| Substring | ✓ | String | - | Substring |
| Case Sensitive | ✓ | Boolean | false | Whether the substring match is case sensitive |
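
A minimal sketch of substring matching with the Case Sensitive option, assuming that a case-insensitive match lowercases both the substring and the column value before comparing. The helper `substring_filter` is hypothetical.

```python
def substring_filter(rows, attribute, substring, case_sensitive=False):
    """Keep rows whose `attribute` value contains `substring`."""
    needle = substring if case_sensitive else substring.lower()
    kept = []
    for row in rows:
        haystack = str(row[attribute])
        if not case_sensitive:
            haystack = haystack.lower()
        if needle in haystack:
            kept.append(row)
    return kept


reviews = [{"text": "Great Coffee"}, {"text": "coffee was cold"}, {"text": "tea only"}]
loose = substring_filter(reviews, "text", "coffee")
exact = substring_filter(reviews, "text", "coffee", case_sensitive=True)
print(len(loose), len(exact))
```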
Output Ports
4 - Data Cleaning
Operators in the Data Cleaning category
Subcategories
Operators
| Operator | Description |
|---|---|
| Distinct | Remove duplicate tuples |
| Filter | Performs a filter operation using OR between multiple predicates |
| Limit | Limit the number of output rows |
| Projection | Keeps or drops the selected columns |
| Type Casting | Cast between types |
Total: 5 operators
4.1 - Join
Operators in the Join category
Operators
| Operator | Description |
|---|---|
| Cartesian Product | Append fields together to get the cartesian product of two inputs |
| Hash Join | Join two inputs |
| Interval Join | Join two inputs with left table join key in the range of [right table join key, right table join key + constant value] |
Total: 3 operators
4.1.1 - Cartesian Product
Append fields together to get the cartesian product of two inputs
Output Ports
4.1.2 - Hash Join
Join two inputs
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Left Input Attribute | ✓ | String | - | Attribute to be joined on the Left Input |
| Right Input Attribute | ✓ | String | - | Attribute to be joined on the Right Input |
| Join Type | ✓ | inner, left outer, right outer, full outer | inner | Select the join type to execute |
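
The four join types can be compared with a small hash join over lists of dicts. This is a sketch under assumptions, not the operator's implementation: `hash_join` is a hypothetical helper, the choice to build the hash table on the right input is arbitrary, and unmatched sides are padded with `None`.

```python
from collections import defaultdict


def hash_join(left, right, left_attr, right_attr, join_type="inner"):
    """Sketch of a hash join over lists of dicts; None pads unmatched sides."""
    left_cols = list(left[0]) if left else []
    right_cols = list(right[0]) if right else []
    # Build phase: index the right input by its join key.
    index = defaultdict(list)
    for pos, row in enumerate(right):
        index[row[right_attr]].append(pos)
    matched_right = set()
    out = []
    # Probe phase: look up each left row in the index.
    for lrow in left:
        hits = index.get(lrow[left_attr], [])
        for pos in hits:
            matched_right.add(pos)
            out.append({**lrow, **right[pos]})
        if not hits and join_type in ("left outer", "full outer"):
            out.append({**lrow, **{c: None for c in right_cols}})
    # Outer joins on the right side also emit unmatched right rows.
    if join_type in ("right outer", "full outer"):
        for pos, rrow in enumerate(right):
            if pos not in matched_right:
                out.append({**{c: None for c in left_cols}, **rrow})
    return out


users = [{"uid": 1, "name": "ann"}, {"uid": 2, "name": "bob"}]
orders = [{"user_id": 1, "item": "pen"}, {"user_id": 3, "item": "ink"}]
inner = hash_join(users, orders, "uid", "user_id")
left_outer = hash_join(users, orders, "uid", "user_id", "left outer")
full_outer = hash_join(users, orders, "uid", "user_id", "full outer")
print(len(inner), len(left_outer), len(full_outer))
```

The example uses distinct column names on both sides; how the operator resolves name collisions is not covered here.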
Output Ports
4.1.3 - Interval Join
Join two inputs with left table join key in the range of [right table join key, right table join key + constant value]
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Interval Constant | ✓ | Long | 10 | Left attribute between right attribute and right attribute + constant |
| Include Left Bound | ✓ | Boolean | true | Include condition left attribute = right attribute |
| Include Right Bound | ✓ | Boolean | true | Include condition left attribute = right attribute + constant |
| Time interval type | | TimeIntervalType | day | Year, Month, Day, Hour, Minute or Second |
| Left Input attr | ✓ | String (integer, long, double, timestamp) | - | Choose one attribute in the left table |
| Right Input attr | ✓ | String | - | Choose one attribute in the right table |
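
The join condition and the two bound flags can be sketched for numeric keys as follows. The helpers `in_interval` and `interval_join` are hypothetical; timestamp keys combined with a time interval type would need date arithmetic that this sketch leaves out.

```python
def in_interval(left, right, constant, include_left=True, include_right=True):
    """True when `left` lies between `right` and `right + constant`."""
    lower, upper = right, right + constant
    above = left >= lower if include_left else left > lower
    below = left <= upper if include_right else left < upper
    return above and below


def interval_join(left_rows, right_rows, left_attr, right_attr, constant, **bounds):
    """Nested-loop sketch: pair every left row with every qualifying right row."""
    return [
        {**lrow, **rrow}
        for lrow in left_rows
        for rrow in right_rows
        if in_interval(lrow[left_attr], rrow[right_attr], constant, **bounds)
    ]


events = [{"event_t": 10}, {"event_t": 15}, {"event_t": 20}, {"event_t": 21}]
windows = [{"start": 10}]
closed = interval_join(events, windows, "event_t", "start", 10)
half_open = interval_join(events, windows, "event_t", "start", 10, include_right=False)
print([row["event_t"] for row in closed], [row["event_t"] for row in half_open])
```

With both bounds included, the keys 10, 15, and 20 fall in [10, 20]; excluding the right bound drops 20.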
Output Ports
4.2 - Set
Operators in the Set category
Operators
| Operator | Description |
|---|---|
| Difference | Find the set difference of two inputs |
| Intersect | Take the intersection of two inputs |
| SymmetricDifference | Find the symmetric difference (the set of elements which are in either of the sets, but not in their intersection) of two inputs |
| Union | Unions the output rows from multiple input operators |
Total: 4 operators
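
The four set operations map directly onto Python's built-in set operators when each tuple is treated as a hashable row. This is only an illustration of the definitions: Python sets drop duplicate rows, which may not match how Union treats repeated rows.

```python
left = {("ann", 1), ("bob", 2), ("cy", 3)}
right = {("bob", 2), ("cy", 3), ("dee", 4)}

difference = left - right            # rows only in the left input
intersect = left & right             # rows present in both inputs
symmetric_difference = left ^ right  # rows in exactly one input
union = left | right                 # rows in either input

print(difference, intersect, symmetric_difference, union, sep="\n")
```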
4.2.1 - Difference
Find the set difference of two inputs
Output Ports
4.2.2 - Intersect
Take the intersection of two inputs
Output Ports
4.2.3 - SymmetricDifference
Find the symmetric difference (the set of elements which are in either of the sets, but not in their intersection) of two inputs
Output Ports
4.2.4 - Union
Unions the output rows from multiple input operators
Output Ports
4.3 - Aggregate
Operators in the Aggregate category
Operators
| Operator | Description |
|---|---|
| Aggregate | Calculate different types of aggregation values |
Total: 1 operator
4.3.1 - Aggregate
Calculate different types of aggregation values
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Aggregations | ✓ | List | - | Multiple aggregation functions (min: 1, aggregations cannot be empty) |
| ↳ Aggregate Func | ✓ | sum, count, average, min, max, concat | - | Sum, count, average, min, max, or concat |
| ↳ Attribute | ✓ | String | - | Column to aggregate |
| ↳ Result Attribute | ✓ | String | - | Column name of the aggregation result |
| Group By Keys | | List | - | Group by columns |
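
How several aggregation functions combine with Group By Keys can be sketched in plain Python. The helper `aggregate` and the triple layout are hypothetical, and the comma separator used for `concat` is an assumption.

```python
from collections import defaultdict

AGG_FUNCS = {
    "sum": sum,
    "count": len,
    "average": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
    # Assumption: concat joins the values with a comma.
    "concat": lambda values: ",".join(str(v) for v in values),
}


def aggregate(rows, aggregations, group_by=()):
    """aggregations: list of (func_name, attribute, result_attribute) triples."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[key] for key in group_by)].append(row)
    out = []
    for key, members in groups.items():
        # Each output row carries the group keys plus one column per aggregation.
        result = dict(zip(group_by, key))
        for func, attribute, result_attribute in aggregations:
            values = [member[attribute] for member in members]
            result[result_attribute] = AGG_FUNCS[func](values)
        out.append(result)
    return out


sales = [
    {"region": "west", "amount": 10},
    {"region": "east", "amount": 5},
    {"region": "west", "amount": 30},
]
summary = aggregate(
    sales,
    [("sum", "amount", "total"), ("count", "amount", "n")],
    group_by=("region",),
)
print(summary)
```

Leaving `group_by` empty puts every row into a single group, which yields one output row.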
Output Ports
4.4 - Sort
Operators in the Sort category
Operators
| Operator | Description |
|---|---|
| Sort | Sort based on the columns and sorting methods |
| Sort Partitions | Sort Partitions |
| Stable Merge Sort | Stable per-partition sort with multi-key ordering (incremental stack of sorted buckets) |
Total: 3 operators
4.4.1 - Sort
Sort based on the columns and sorting methods
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Attributes | ✓ | List | - | Columns to sort on |
| ↳ Attribute | ✓ | String | - | Attribute name to sort by |
| ↳ Sort Preference | ✓ | ASC, DESC | - | Sort preference (ASC or DESC) |
Output Ports
4.4.2 - Sort Partitions
Sort Partitions
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Attribute | ✓ | String (integer, long, double) | - | Attribute to sort (must be numerical) |
| Attribute Domain Min | ✓ | Long | 0 | Minimum value of the domain of the attribute |
| Attribute Domain Max | ✓ | Long | 0 | Maximum value of the domain of the attribute |
Output Ports
4.4.3 - Stable Merge Sort
Stable per-partition sort with multi-key ordering (incremental stack of sorted buckets)
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Sort Keys | ✓ | List | - | List of attributes to sort by with ordering preferences |
| ↳ Attribute | ✓ | String | - | Attribute name to sort by |
| ↳ Sort Preference | ✓ | ASC, DESC | - | Sort preference (ASC or DESC) |
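
The ordering that a list of sort keys with per-key ASC or DESC preferences produces can be sketched with Python's stable sort. This shows only the resulting order, not the operator's bucket-merging algorithm; the helper `sort_rows` is hypothetical, and the first key is assumed to be the most significant.

```python
def sort_rows(rows, sort_keys):
    """sort_keys: list of (attribute, "ASC" | "DESC"), most significant first."""
    ordered = list(rows)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a multi-key ordering.
    for attribute, preference in reversed(sort_keys):
        ordered.sort(key=lambda row: row[attribute], reverse=(preference == "DESC"))
    return ordered


staff = [
    {"dept": "b", "age": 30, "id": 1},
    {"dept": "a", "age": 25, "id": 2},
    {"dept": "a", "age": 40, "id": 3},
    {"dept": "b", "age": 30, "id": 4},
]
ordered = sort_rows(staff, [("dept", "ASC"), ("age", "DESC")])
print([row["id"] for row in ordered])
```

Rows 1 and 4 tie on both keys and keep their input order, which is what stability means here.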
Output Ports
4.5 - Distinct
Remove duplicate tuples
Output Ports
4.6 - Filter
Performs a filter operation using OR between multiple predicates
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Predicates | ✓ | List | - | Multiple predicates in OR |
| ↳ Attribute | ✓ | String | - | |
| ↳ Condition | ✓ | =, >, >=, <, <=, !=, is null, is not null | - | |
| ↳ Value | | String | - | |
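
Combining predicates with OR means a row is kept as soon as any single predicate matches. A minimal sketch, assuming values are already typed (the Value property itself is a string) and that a null field never satisfies a comparison; `matches` and `filter_rows` are hypothetical helpers.

```python
import operator

CONDITIONS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "!=": operator.ne,
}


def matches(row, attribute, condition, value=None):
    """Evaluate one predicate against one row."""
    field = row.get(attribute)
    if condition == "is null":
        return field is None
    if condition == "is not null":
        return field is not None
    # Assumption: a null field never satisfies a comparison.
    return field is not None and CONDITIONS[condition](field, value)


def filter_rows(rows, predicates):
    """Keep a row when ANY predicate matches (predicates are OR-ed)."""
    return [row for row in rows if any(matches(row, *p) for p in predicates)]


people = [
    {"age": 17, "city": "LA"},
    {"age": 30, "city": None},
    {"age": 45, "city": "SF"},
]
kept = filter_rows(people, [("age", "<", 18), ("city", "is null")])
print(kept)
```

To require several conditions at once (AND), one option is to chain multiple filter steps, each with a single predicate.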
Output Ports
4.7 - Limit
Limit the number of output rows
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Limit | ✓ | Integer | 0 | The max number of output rows |
Output Ports
4.8 - Projection
Keeps or drops the selected columns
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Drop Option | ✓ | Boolean | false | Check to drop the selected attributes |
| Attributes | ✓ | List | - | |
| ↳ Attribute | ✓ | String | - | Attribute name in the schema |
| ↳ Alias | | String | - | Renamed attribute name |
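
The difference between keeping and dropping the selected attributes, and the effect of an alias, can be sketched as follows. The helper `project` is hypothetical, and ignoring aliases when Drop Option is set is an assumption.

```python
def project(rows, attributes, drop=False):
    """attributes: list of (attribute, alias_or_None) pairs."""
    if drop:
        # Assumption: aliases are irrelevant when the attributes are dropped.
        dropped = {name for name, _ in attributes}
        return [{k: v for k, v in row.items() if k not in dropped} for row in rows]
    # Keep only the listed attributes, renamed to their alias when one is given.
    return [{(alias or name): row[name] for name, alias in attributes} for row in rows]


records = [{"id": 1, "first": "ann", "tmp": 0}]
kept_cols = project(records, [("id", None), ("first", "name")])
dropped_cols = project(records, [("tmp", None)], drop=True)
print(kept_cols, dropped_cols)
```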
Output Ports
4.9 - Type Casting
Cast between types
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| TypeCasting Units | ✓ | List | - | Multiple type castings |
| ↳ Attribute | ✓ | String | - | Attribute for type casting |
| ↳ Cast type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | Result type after type casting |
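
A list of type casting units amounts to applying one conversion per attribute. The sketch below is an assumption-heavy illustration: `cast_row` is hypothetical, the parsing rules (which strings count as true, ISO 8601 timestamps, UTF-8 for binary) are assumptions rather than the operator's actual rules, and `large_binary` is left out.

```python
from datetime import datetime

# Assumed conversion rules, one per cast type.
CASTS = {
    "string": str,
    "integer": int,
    "long": int,
    "double": float,
    "boolean": lambda v: str(v).strip().lower() in ("true", "1"),
    "timestamp": lambda v: datetime.fromisoformat(str(v)),
    "binary": lambda v: str(v).encode("utf-8"),
}


def cast_row(row, units):
    """units: list of (attribute, cast_type) pairs; other attributes pass through."""
    converted = {attribute: CASTS[cast_type](row[attribute]) for attribute, cast_type in units}
    return {**row, **converted}


raw = {"id": "42", "price": "9.5", "active": "TRUE", "seen": "2024-01-02T03:04:05", "note": "x"}
casted = cast_row(
    raw,
    [("id", "integer"), ("price", "double"), ("active", "boolean"), ("seen", "timestamp")],
)
print(casted)
```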
Output Ports
5 - Machine Learning
Operators in the Machine Learning category
Subcategories
5.1 - Sklearn
Operators in the Sklearn category
Subcategories
Operators
Total: 28 operators
5.1.1 - Sklearn Training
Operators in the Sklearn Training category
Operators
Total: 26 operators
5.1.1.1 - Training: Adaptive Boosting
Sklearn Training: Adaptive Boosting Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
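
The Count Vectorizer option, shared by all Sklearn training operators in this section, turns a text attribute into a matrix of token counts. The idea can be sketched with the standard library alone. This is not scikit-learn's code: `count_matrix` is a hypothetical helper that lowercases text and keeps words of two or more characters, in the spirit of scikit-learn's `CountVectorizer` defaults. The Tfidf Transformer option would then reweight these counts, which the sketch does not do.

```python
import re
from collections import Counter

TOKEN = re.compile(r"\b\w\w+\b")  # words of two or more characters


def count_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    counts = [Counter(TOKEN.findall(doc.lower())) for doc in documents]
    # The vocabulary is the sorted set of all tokens seen in any document.
    vocabulary = sorted(set().union(*counts))
    return vocabulary, [[c[term] for term in vocabulary] for c in counts]


docs = ["The cat sat", "the cat saw the dog"]
vocabulary, matrix = count_matrix(docs)
print(vocabulary)
print(matrix)
```

Each row of the matrix corresponds to one input document and each column to one vocabulary term.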
Output Ports
5.1.1.2 - Training: Bagging Training
Sklearn Training: Bagging Training Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.3 - Training: Bernoulli Naive Bayes
Sklearn Training: Bernoulli Naive Bayes Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.4 - Training: Complement Naive Bayes
Sklearn Training: Complement Naive Bayes Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.5 - Training: Decision Tree
Sklearn Training: Decision Tree Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.6 - Training: Dummy Classifier
Sklearn Training: Dummy Classifier Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.7 - Training: Extra Tree
Sklearn Training: Extra Tree Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.8 - Training: Extra Trees
Sklearn Training: Extra Trees Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.9 - Training: Gaussian Naive Bayes
Sklearn Training: Gaussian Naive Bayes Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.10 - Training: Gradient Boosting
Sklearn Training: Gradient Boosting Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.11 - Training: K-nearest Neighbors
Sklearn Training: K-nearest Neighbors Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.12 - Training: Linear Perceptron
Sklearn Training: Linear Perceptron Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.13 - Training: Linear Regression
Sklearn Training: Linear Regression Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.14 - Training: Linear Support Vector Machine
Sklearn Training: Linear Support Vector Machine Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.15 - Training: Logistic Regression
Sklearn Training: Logistic Regression Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.16 - Training: Logistic Regression Cross Validation
Sklearn Training: Logistic Regression Cross Validation Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.17 - Training: Multi-layer Perceptron
Sklearn Training: Multi-layer Perceptron Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.18 - Training: Multinomial Naive Bayes
Sklearn Training: Multinomial Naive Bayes Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.19 - Training: Nearest Centroid
Sklearn Training: Nearest Centroid Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.20 - Training: Passive Aggressive
Sklearn Training: Passive Aggressive Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.21 - Training: Probability Calibration
Sklearn Training: Probability Calibration Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.22 - Training: Random Forest
Sklearn Training: Random Forest Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.23 - Training: Ridge Regression
Sklearn Training: Ridge Regression Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.24 - Training: Ridge Regression Cross Validation
Sklearn Training: Ridge Regression Cross Validation Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.25 - Training: Stochastic Gradient Descent
Sklearn Training: Stochastic Gradient Descent Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.1.26 - Training: Support Vector Machine
Sklearn Training: Support Vector Machine Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.2 - Adaptive Boosting
Sklearn Adaptive Boosting Operator
| Property | Requirement | Type | Default | Description |
|---|---|---|---|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.3 - Bagging
Sklearn Bagging Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.4 - Bernoulli Naive Bayes
Sklearn Bernoulli Naive Bayes Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.5 - Complement Naive Bayes
Sklearn Complement Naive Bayes Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.6 - Decision Tree
Sklearn Decision Tree Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.7 - Dummy Classifier
Sklearn Dummy Classifier Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.8 - Extra Tree
Sklearn Extra Tree Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.9 - Extra Trees
Sklearn Extra Trees Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.10 - Gaussian Naive Bayes
Sklearn Gaussian Naive Bayes Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.11 - Gradient Boosting
Sklearn Gradient Boosting Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.12 - K-nearest Neighbors
Sklearn K-nearest Neighbors Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.13 - Linear Perceptron
Sklearn Linear Perceptron Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.14 - Linear Regression
Sklearn Linear Regression Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Degree | ✓ | Integer | 1 | Degree of polynomial function |
Output Ports
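The Degree property suggests that the input is expanded into polynomial terms before a linear model is fitted. The following is a minimal sketch of that idea for a single numeric feature, using ordinary least squares via the normal equations. How the operator actually builds its features is an assumption here, and `poly_features`, `solve`, and `fit_polynomial` are hypothetical helper names.

```python
def poly_features(x, degree):
    """Expand a scalar x into [1, x, x**2, ..., x**degree]."""
    return [x ** p for p in range(degree + 1)]


def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    rows = [poly_features(x, degree) for x in xs]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)  # approximately [1.0, 2.0, 3.0]
```

With the default degree of 1 the same code reduces to a straight-line fit with an intercept and one slope.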
5.1.15 - Linear Support Vector Machine
Sklearn Linear Support Vector Machine Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.16 - Logistic Regression
Sklearn Logistic Regression Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.17 - Logistic Regression Cross Validation
Sklearn Logistic Regression Cross Validation Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.18 - Multi-layer Perceptron
Sklearn Multi-layer Perceptron Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.19 - Multinomial Naive Bayes
Sklearn Multinomial Naive Bayes Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.20 - Nearest Centroid
Sklearn Nearest Centroid Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.21 - Passive Aggressive
Sklearn Passive Aggressive Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.22 - Probability Calibration
Sklearn Probability Calibration Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.23 - Random Forest
Sklearn Random Forest Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.24 - Ridge Regression
Sklearn Ridge Regression Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.25 - Ridge Regression Cross Validation
Sklearn Ridge Regression Cross Validation Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.26 - Sklearn Prediction
Sklearn Prediction Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Model Attribute | ✓ | String | model | Attribute corresponding to ML model |
| Output Attribute Name | ✓ | String | prediction | Attribute name of the prediction result |
| Ground Truth Attribute Name To Ignore | | String | - | Attribute name of the ground truth |
Output Ports
5.1.27 - Sklearn Testing
Generate scorers for a Sklearn model
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Regression | ✓ | Boolean | false | Choose to solve a regression task |
| Model Attribute | ✓ | String | model | Attribute corresponding to ML model |
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
Output Ports
5.1.28 - Stochastic Gradient Descent
Sklearn Stochastic Gradient Descent Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.1.29 - Support Vector Machine
Sklearn Support Vector Machine Operator
Home > Machine Learning > Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Target Attribute | ✓ | String | - | Attribute in your dataset corresponding to target |
| Count Vectorizer | | Boolean | false | Convert a collection of text documents to a matrix of token counts |
| ↳ Text Attribute | | String | - | Attribute in your dataset with text to vectorize |
| ↳ Tfidf Transformer | | Boolean | false | Transform a count matrix to a normalized tf or tf-idf representation |
Output Ports
5.2 - Advanced Sklearn
Operators in the Advanced Sklearn category
Home > Machine Learning > Advanced Sklearn
Operators
Total: 4 operators
5.2.1 - KNN Classifier
Sklearn KNN Classifier Operator
Home > Machine Learning > Advanced Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Parameter Setting | ✓ | SklearnAdvancedKNNParameters | - | |
| Ground Truth Attribute Column | ✓ | String | - | Ground truth attribute column |
| Selected Features | ✓ | List | - | Features used to train the model |
Output Ports
5.2.2 - KNN Regressor
Sklearn KNN Regressor Operator
Home > Machine Learning > Advanced Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Parameter Setting | ✓ | SklearnAdvancedKNNParameters | - | |
| Ground Truth Attribute Column | ✓ | String | - | Ground truth attribute column |
| Selected Features | ✓ | List | - | Features used to train the model |
Output Ports
5.2.3 - SVM Classifier
Sklearn SVM Classifier Operator
Home > Machine Learning > Advanced Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Parameter Setting | ✓ | SklearnAdvancedSVCParameters | - | |
| Ground Truth Attribute Column | ✓ | String | - | Ground truth attribute column |
| Selected Features | ✓ | List | - | Features used to train the model |
Output Ports
5.2.4 - SVM Regressor
Sklearn SVM Regressor Operator
Home > Machine Learning > Advanced Sklearn
| Property | Requirement | Type | Default | Description |
|---|
| Parameter Setting | ✓ | SklearnAdvancedSVRParameters | - | |
| Ground Truth Attribute Column | ✓ | String | - | Ground truth attribute column |
| Selected Features | ✓ | List | - | Features used to train the model |
Output Ports
5.3 - Hugging Face
Operators in the Hugging Face category
Home > Machine Learning > Hugging Face
Operators
Total: 4 operators
5.3.1 - Hugging Face Iris Logistic Regression
Predict whether an iris is an Iris-setosa using a pre-trained logistic regression model
Home > Machine Learning > Hugging Face
| Property | Requirement | Type | Default | Description |
|---|
| Petal Length Cm Attribute | ✓ | String | - | Attribute in your dataset corresponding to PetalLengthCm |
| Petal Width Cm Attribute | ✓ | String | - | Attribute in your dataset corresponding to PetalWidthCm |
| Prediction Class Name | ✓ | String | Species_prediction | Output attribute name for the predicted class of species |
| Prediction Probability Name | ✓ | String | Species_probability | Output attribute name for the prediction’s probability of being an Iris-setosa |
Output Ports
5.3.2 - Hugging Face Sentiment Analysis
Analyze sentiment with a Twitter-based model from Hugging Face
Home > Machine Learning > Hugging Face
| Property | Requirement | Type | Default | Description |
|---|
| Attribute | ✓ | String | - | Column to perform sentiment analysis on |
| Positive Result Attribute | ✓ | String | huggingface_sentiment_positive | Column name of the sentiment analysis result (positive) |
| Neutral Result Attribute | ✓ | String | huggingface_sentiment_neutral | Column name of the sentiment analysis result (neutral) |
| Negative Result Attribute | ✓ | String | huggingface_sentiment_negative | Column name of the sentiment analysis result (negative) |
Output Ports
5.3.3 - Hugging Face Spam Detection
Detect spam with an SMS spam detection model from Hugging Face
Home > Machine Learning > Hugging Face
| Property | Requirement | Type | Default | Description |
|---|
| Attribute | ✓ | String | - | Column to perform spam detection on |
| Spam Result Attribute | ✓ | String | is_spam | Column name for the spam flag (whether the text is spam) |
| Score Result Attribute | ✓ | String | score | Column name for the classification probability |
Output Ports
5.3.4 - Hugging Face Text Summarization
Summarize the given text content with a mini2bert pre-trained model from Hugging Face
Home > Machine Learning > Hugging Face
| Property | Requirement | Type | Default | Description |
|---|
| Attribute | ✓ | String | - | Attribute to perform text summarization on |
| Result Attribute Name | | String | summary | Attribute name of the text summary result |
Output Ports
5.4 - Machine Learning General
Operators in the Machine Learning General category
Home > Machine Learning > Machine Learning General
Operators
Total: 1 operator
5.4.1 - Machine Learning Scorer
Scorer for machine learning models
Home > Machine Learning > Machine Learning General
| Property | Requirement | Type | Default | Description |
|---|
| Regression | ✓ | Boolean | false | Choose to solve a regression task |
| ↳ Scorer Functions | | List | - | Metrics to compute for classification tasks |
| ↳ Scorer Functions | | List | - | Metrics to compute for regression tasks |
| Actual Value | ✓ | String | - | Specify the label attribute |
| Predicted Value | ✓ | String | - | Specify the attribute generated by the model |
Output Ports
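As an illustration of what a scorer computes from the Actual Value and Predicted Value attributes, here is a minimal pure-Python sketch. The chosen metrics (accuracy, precision, mean squared error), the output keys, and the `score` helper are assumptions for illustration and are not the operator's actual metric list or implementation.

```python
def accuracy(actual, predicted):
    """Fraction of rows where the prediction equals the label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def precision(actual, predicted, positive):
    """Among rows predicted as `positive`, the fraction that truly are."""
    hits = [a for a, p in zip(actual, predicted) if p == positive]
    if not hits:
        return 0.0
    return sum(a == positive for a in hits) / len(hits)


def mean_squared_error(actual, predicted):
    """Average squared difference between label and prediction."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def score(rows, actual_col, predicted_col, regression=False):
    """Pick a metric family based on the Regression flag."""
    actual = [row[actual_col] for row in rows]
    predicted = [row[predicted_col] for row in rows]
    if regression:
        return {"MSE": mean_squared_error(actual, predicted)}
    return {"Accuracy": accuracy(actual, predicted)}


rows = [
    {"label": "spam", "pred": "spam"},
    {"label": "ham", "pred": "spam"},
    {"label": "ham", "pred": "ham"},
    {"label": "spam", "pred": "spam"},
]
classification_scores = score(rows, "label", "pred")
```

Setting `regression=True` switches the sketch to the regression metric, mirroring how the Regression property selects which Scorer Functions list applies.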
6 - Utilities
Operators in the Utilities category
Home > Utilities
Operators
| Operator | Description |
|---|
| Random K Sampling | Random sampling with given percentage |
| Reservoir Sampling | Reservoir Sampling with k items being kept randomly |
| Split | Split data into two different ports |
| Unnest String | Unnest the string values in the column separated by a delimiter to multiple values |
Total: 4 operators
6.1 - Random K Sampling
Random sampling with given percentage
Home > Utilities
| Property | Requirement | Type | Default | Description |
|---|
| Random K Sample Percentage | ✓ | Integer | 0 | Random k sampling with given percentage |
Output Ports
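One common way to sample by percentage is to keep each tuple independently with probability `percentage / 100`. The sketch below shows that approach. Whether the operator samples this way, and the `seed` parameter, are assumptions for illustration.

```python
import random


def random_k_sample(rows, percentage, seed=None):
    """Keep each row independently with probability percentage / 100."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    rng = random.Random(seed)
    # rng.random() is in [0.0, 1.0), so 0 keeps nothing and 100 keeps everything.
    return [row for row in rows if rng.random() < percentage / 100]


sample = random_k_sample(range(1000), percentage=10, seed=42)
```

With this approach the output size is only approximately `percentage` percent of the input, which is why the default of 0 produces no output.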
6.2 - Reservoir Sampling
Reservoir Sampling with k items being kept randomly
Home > Utilities
| Property | Requirement | Type | Default | Description |
|---|
| Number Of Item Sampled In Reservoir Sampling | ✓ | Integer | 0 | Reservoir sampling with k items being kept randomly |
Output Ports
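Reservoir sampling keeps a uniform random sample of k items from a stream of unknown length in a single pass. A minimal sketch of the classic Algorithm R follows. The function name and the `seed` parameter are illustrative and not part of the operator.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return k items chosen uniformly at random from an iterable."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


kept = reservoir_sample(range(10_000), k=5, seed=7)
```

If the stream holds fewer than k items, every item is kept.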
6.3 - Split
Split data into two different ports
Home > Utilities
| Property | Requirement | Type | Default | Description |
|---|
| Split Percentage | | Integer | 80 | Percentage of data going to the upper port |
| Auto-Generate Seed | | Boolean | true | Shuffle the data based on a random seed |
| ↳ Seed | | Integer | 1 | An int for reproducible output across multiple runs |
Output Ports
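The interaction between Split Percentage and Seed can be sketched as a seeded shuffle followed by a cut. This batch-style version is an assumption for illustration: a streaming implementation may instead draw a random number per tuple, and the `split` function name is hypothetical.

```python
import random


def split(rows, percentage=80, seed=1):
    """Shuffle rows with a fixed seed, then cut at the given percentage."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = len(shuffled) * percentage // 100
    # First part goes to the upper port, the remainder to the lower port.
    return shuffled[:cut], shuffled[cut:]


upper, lower = split(range(10), percentage=80, seed=1)
```

Reusing the same seed reproduces the same partition across runs, which is the purpose of the Seed property.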
6.4 - Unnest String
Unnest the string values in the column separated by a delimiter to multiple values
Home > Utilities
| Property | Requirement | Type | Default | Description |
|---|
| Delimiter | ✓ | String | , | String that separates the data |
| Attribute | ✓ | String | - | Column of the string to unnest |
| Result Attribute | ✓ | String | unnestResult | Column name of the unnest result |
Output Ports
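The effect of unnesting can be sketched as emitting one output row per delimited value. The sketch below uses plain dictionaries as stand-ins for tuples. Whether the operator trims whitespace or drops empty values is not stated, so this version does neither.

```python
def unnest_string(rows, attribute, delimiter=",", result_attribute="unnestResult"):
    """Yield one copy of each row per delimited value in `attribute`."""
    for row in rows:
        for value in row[attribute].split(delimiter):
            yield {**row, result_attribute: value}


rows = [{"id": 1, "tags": "a,b"}, {"id": 2, "tags": "c"}]
unnested = list(unnest_string(rows, "tags"))
```

Each output row keeps the original columns and adds the Result Attribute column holding a single value.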
7 - External API
Operators in the External API category
Home > External API
Operators
Total: 4 operators
7.1 - Reddit Search
Search for recent posts with PRAW, the Python wrapper for the Reddit API
Home > External API
| Property | Requirement | Type | Default | Description |
|---|
| Client Id | ✓ | String | - | Client ID used to access the Reddit API |
| Client Secret | ✓ | String | - | Client secret used to access the Reddit API |
| Query | ✓ | String | - | Search query |
| Limit | ✓ | Integer | 100 | Up to 1000 |
| Sorting | ✓ | none, controversial, gilded, hot, new, rising, top | none | Sorting method for the results (hot, new, etc.) |
Output Ports
7.2 - Twitter Full Archive Search API
Retrieve data from Twitter Full Archive Search API
Home > External API
| Property | Requirement | Type | Default | Description |
|---|
| API Key | ✓ | String | - | |
| API Secret Key | ✓ | String | - | |
| Stop Upon Rate Limit | ✓ | Boolean | false | Stop when hitting rate limit? |
| Search Query | ✓ | String | - | Up to 1024 characters (Limited by Twitter) |
| From Datetime | ✓ | String | 2021-04-01T00:00:00Z | ISO 8601 format |
| To Datetime | ✓ | String | 2021-05-01T00:00:00Z | ISO 8601 format |
| Limit | ✓ | Integer | 100 | Maximum number of tweets to retrieve |
Output Ports
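The From Datetime and To Datetime defaults use the ISO 8601 form `YYYY-MM-DDTHH:MM:SSZ`. A small sketch for checking such values before running a search is shown below. The helper names are hypothetical, and the operator's own validation may differ.

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # e.g. 2021-04-01T00:00:00Z


def parse_utc(value):
    """Parse a 'Z'-suffixed ISO 8601 timestamp as a UTC datetime."""
    return datetime.strptime(value, ISO_FORMAT).replace(tzinfo=timezone.utc)


def validate_range(from_datetime, to_datetime):
    """Return (start, end) and reject empty or reversed ranges."""
    start, end = parse_utc(from_datetime), parse_utc(to_datetime)
    if start >= end:
        raise ValueError("From Datetime must be earlier than To Datetime")
    return start, end


start, end = validate_range("2021-04-01T00:00:00Z", "2021-05-01T00:00:00Z")
```

A malformed string raises `ValueError` from `datetime.strptime`, so both format and ordering problems surface the same way.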
7.3 - Twitter Search API
Retrieve data from Twitter Search API
Home > External API
| Property | Requirement | Type | Default | Description |
|---|
| API Key | ✓ | String | - | |
| API Secret Key | ✓ | String | - | |
| Stop Upon Rate Limit | ✓ | Boolean | false | Stop when hitting rate limit? |
| Search Query | ✓ | String | - | Up to 1024 characters (Limited by Twitter) |
| Limit | ✓ | Integer | 100 | Maximum number of tweets to retrieve |
Output Ports
7.4 - URL Fetcher
Fetch the content of a single URL
Home > External API
| Property | Requirement | Type | Default | Description |
|---|
| URL | ✓ | String | - | Only accepts standard URL format |
| Decoding | ✓ | UTF-8, RAW BYTES | - | The decoding method for the url content |
Output Ports
8 - User-defined Functions
Operators in the User-defined Functions category
Home > User-defined Functions
Subcategories
8.1 - Python
Operators in the Python category
Home > User-defined Functions > Python
Operators
Total: 5 operators
8.1.1 - 1-out Python UDF
User-defined function operator in Python script
Home > User-defined Functions > Python
| Property | Requirement | Type | Default | Description |
|---|
| Python script | ✓ | Code (python) | See template below | Input your code here |
| Worker count | ✓ | Integer | 1 | Specify how many parallel workers to launch |
| Columns | | List | - | The columns of the source |
| ↳ Attribute Name | ✓ | String | - | |
| ↳ Attribute Type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | |
Default Code Template
Python script
# from pytexera import *
# class GenerateOperator(UDFSourceOperator):
#
#     @overrides
#
#     def produce(self) -> Iterator[Union[TupleLike, TableLike, None]]:
#         yield
Output Ports
8.1.2 - 2-in Python UDF
User-defined function operator in Python script
Home > User-defined Functions > Python
| Property | Requirement | Type | Default | Description |
|---|
| Python script | ✓ | Code (python) | See template below | Input your code here |
| Worker count | ✓ | Integer | 1 | Specify how many parallel workers to launch |
| Retain input columns | ✓ | Boolean | true | Keep the original input columns? |
| Extra output column(s) | | List | - | Name of the newly added output columns that the UDF will produce, if any |
| ↳ Attribute Name | ✓ | String | - | |
| ↳ Attribute Type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | |
Default Code Template
Python script
# Choose from the following templates:
#
# from pytexera import *
#
# class ProcessTupleOperator(UDFOperatorV2):
#
#     @overrides
#     def process_tuple(self, tuple_: Tuple, port: int) -> Iterator[Optional[TupleLike]]:
#         yield tuple_
#
# class ProcessBatchOperator(UDFBatchOperator):
#     BATCH_SIZE = 10 # must be a positive integer
#
#     @overrides
#     def process_batch(self, batch: Batch, port: int) -> Iterator[Optional[BatchLike]]:
#         yield batch
#
# class ProcessTableOperator(UDFTableOperator):
#
#     @overrides
#     def process_table(self, table: Table, port: int) -> Iterator[Optional[TableLike]]:
#         yield table
Output Ports
8.1.3 - Python Lambda Function
Modify or add a new column with more ease
Home > User-defined Functions > Python
| Property | Requirement | Type | Default | Description |
|---|
| Add/Modify column(s) | | List | - | |
| ↳ Attribute Name | ✓ | String | - | |
| ↳ Expression | ✓ | String | - | |
| ↳ Attribute Type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | |
Output Ports
8.1.4 - Python Table Reducer
Reduce Table to Tuple
Home > User-defined Functions > Python
| Property | Requirement | Type | Default | Description |
|---|
| Output columns | | List | - | |
| ↳ Attribute Name | ✓ | String | - | |
| ↳ Expression | ✓ | String | - | |
| ↳ Attribute Type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | |
Output Ports
8.1.5 - Python UDF
User-defined function operator in Python script
Home > User-defined Functions > Python
| Property | Requirement | Type | Default | Description |
|---|
| Python script | ✓ | Code (python) | See template below | Input your code here |
| Worker count | ✓ | Integer | 1 | Specify how many parallel workers to launch |
| Retain input columns | ✓ | Boolean | true | Keep the original input columns? |
| Extra output column(s) | | List | - | Name of the newly added output columns that the UDF will produce, if any |
| ↳ Attribute Name | ✓ | String | - | |
| ↳ Attribute Type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | |
Default Code Template
Python script
# Choose from the following templates:
#
# from pytexera import *
#
# class ProcessTupleOperator(UDFOperatorV2):
#
#     @overrides
#     def process_tuple(self, tuple_: Tuple, port: int) -> Iterator[Optional[TupleLike]]:
#         yield tuple_
#
# class ProcessBatchOperator(UDFBatchOperator):
#     BATCH_SIZE = 10 # must be a positive integer
#
#     @overrides
#     def process_batch(self, batch: Batch, port: int) -> Iterator[Optional[BatchLike]]:
#         yield batch
#
# class ProcessTableOperator(UDFTableOperator):
#
#     @overrides
#     def process_table(self, table: Table, port: int) -> Iterator[Optional[TableLike]]:
#         yield table
Output Ports
8.2 - Java
Operators in the Java category
Home > User-defined Functions > Java
Operators
| Operator | Description |
|---|
| Java UDF | User-defined function operator in Java script |
Total: 1 operator
8.2.1 - Java UDF
User-defined function operator in Java script
Home > User-defined Functions > Java
| Property | Requirement | Type | Default | Description |
|---|
| Java UDF script | ✓ | Code (java) | See template below | Input your code here |
| Worker count | ✓ | Integer | 1 | Specify how many parallel workers to launch |
| Retain input columns | ✓ | Boolean | true | Keep the original input columns? |
| Extra output column(s) | | List | - | Name of the newly added output columns that the UDF will produce, if any |
| ↳ Attribute Name | ✓ | String | - | |
| ↳ Attribute Type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | |
Default Code Template
Java UDF script
import org.apache.texera.amber.operator.map.MapOpExec;
import org.apache.texera.amber.core.tuple.Tuple;
import org.apache.texera.amber.core.tuple.TupleLike;
import scala.Function1;
import java.io.Serializable;
public class JavaUDFOpExec extends MapOpExec {
    public JavaUDFOpExec () {
        this.setMapFunc((Function1<Tuple, TupleLike> & Serializable) this::processTuple);
    }
    public TupleLike processTuple(Tuple tuple) {
        return tuple;
    }
}
Output Ports
8.3 - R
Operators in the R category
Home > User-defined Functions > R
Operators
| Operator | Description |
|---|
| R UDF | User-defined function operator in R script |
| 1-out R UDF | User-defined function operator in R script |
Total: 2 operators
8.3.1 - 1-out R UDF
User-defined function operator in R script
Home > User-defined Functions > R
| Property | Requirement | Type | Default | Description |
|---|
| R Source UDF Script | ✓ | Code (r) | See template below | Input your code here |
| Worker count | ✓ | Integer | 1 | Specify how many parallel workers to launch |
| Use Tuple API? | ✓ | Boolean | false | Check this box to use Tuple API, leave unchecked to use Table API |
| Columns | | List | - | The columns of the source |
| ↳ Attribute Name | ✓ | String | - | |
| ↳ Attribute Type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | |
Default Code Template
R Source UDF Script
# If using Table API:
# function() {
#   return (data.frame(Column_Here = "Value_Here"))
# }
# If using Tuple API:
# library(coro)
# coro::generator(function() {
#   yield (list(text= "hello world!"))
# })
Output Ports
8.3.2 - R UDF
User-defined function operator in R script
Home > User-defined Functions > R
| Property | Requirement | Type | Default | Description |
|---|
| R UDF Script | ✓ | Code (r) | See template below | Input your code here |
| Worker count | ✓ | Integer | 1 | Specify how many parallel workers to launch |
| Use Tuple API? | ✓ | Boolean | false | Check this box to use Tuple API, leave unchecked to use Table API |
| Retain input columns | ✓ | Boolean | true | Keep the original input columns? |
| Extra output column(s) | | List | - | Name of the newly added output columns that the UDF will produce, if any |
| ↳ Attribute Name | ✓ | String | - | |
| ↳ Attribute Type | ✓ | string, integer, long, double, boolean, timestamp, binary, large_binary | - | |
Default Code Template
R UDF Script
# If using Table API:
# function(table, port) {
#   return (table)
# }
# If using Tuple API:
# library(coro)
# coro::generator(function(tuple, port) {
#   yield (tuple)
# })
Output Ports
9 - Visualization
Operators in the Visualization category
Home > Visualization
Subcategories
Operators
| Operator | Description |
|---|
| Nested Table | Visualize Data in a Depth Two Nested Table |
Total: 1 operator
9.1 - Basic
Operators in the Basic category
Home > Visualization > Basic
Operators
| Operator | Description |
|---|
| Bar Chart | Visualize data in a Bar Chart |
| Bubble Chart | A 3D Scatter Plot; Bubbles are graphed using x and y labels, and their sizes determined by a z-value. |
| Dot Plot | Visualize data using a dot plot |
| Dumbbell Plot | Visualize data in a Dumbbell Plot. A dumbbell plot (also known as a lollipop chart) is typically used to compare two distinct values or time points for the same entity. |
| Figure Factory Table | Visualize data in a figure factory table |
| Filled Area Plot | Visualize data in a filled area plot |
| Gantt Chart | A Gantt chart is a type of bar chart that illustrates a project schedule. The chart lists the tasks to be performed on the vertical axis, and time intervals on the horizontal axis. The width of the horizontal bars in the graph shows the duration of each activity. |
| Hierarchy Chart | Visualize data in hierarchy |
| Icicle Chart | Visualize hierarchical data from root to leaves |
| Line Chart | View the result in a line chart |
| Pie Chart | Visualize data in a Pie Chart |
| Range Slider | Visualize data in a Range Slider |
| Sankey Diagram | Visualize data using a Sankey diagram |
| Scatter Plot | View the result in a scatterplot |
| Tables Plot | Visualize data in a table chart. |
| Time Series Plot | Visualize trends and patterns over time. |
Total: 16 operators
9.1.1 - Bar Chart
Visualize data in a Bar Chart
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Fields | ✓ | String | - | Visualize categorical data in a Bar Chart |
| Category Column | | String | No Selection | Optional - Select a column to Color Code the Categories |
| Horizontal Orientation | | Boolean | false | Orientation Style |
| Pattern | | String | - | Add texture to the chart based on an attribute |
| Value Column | ✓ | String (integer, long, double) | - | The value associated with each category |
Output Ports
9.1.2 - Bubble Chart
A 3D Scatter Plot; Bubbles are graphed using x and y labels, and their sizes determined by a z-value.
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| X-Column | ✓ | String | - | Data column for the x-axis |
| Y-Column | ✓ | String | - | Data column for the y-axis |
| Z-Column | ✓ | String | - | Data column to determine bubble size |
| Enable Color | | Boolean | false | Colors bubbles using a data column |
| Color-Column | ✓ | String | - | Picks data column to color bubbles with if color is enabled |
Output Ports
9.1.3 - Dot Plot
Visualize data using a dot plot
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Count Attribute | ✓ | String | - | The attribute for the counting of the dot plot |
Output Ports
9.1.4 - Dumbbell Plot
Visualize data in a Dumbbell Plot. A dumbbell plot (also known as a lollipop chart) is typically used to compare two distinct values or time points for the same entity.
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Category Column Name | ✓ | String | - | The name of the category column |
| Dumbbell Start Value | ✓ | String | - | The start point value of each dumbbell |
| Dumbbell End Value | ✓ | String | - | The end value of each dumbbell |
| Measurement Column Name | ✓ | String (integer, long, double) | - | The name of the measurement column |
| Compared Column Name | ✓ | String | - | The column name that is being compared |
| Dots | | List | - | |
| ↳ Dot Column Value | ✓ | String (integer, long, double) | - | Value for dot axis |
| Show Legends? | | Boolean | false | Whether to show legends in the graph |
Output Ports
9.1.5 - Figure Factory Table
Visualize data in a figure factory table
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Font Size | | Double | 12 | Font size of the Figure Factory Table |
| Font Color (Hex Code) | | String | #000000 | Font color of the Figure Factory Table |
| Row Height | | Double | 30 | Row height of the Figure Factory Table |
| Add Attribute | ✓ | List | [1 items] | List of columns to include in the figure factory table |
| ↳ Attribute Name | ✓ | String | - | |
Output Ports
9.1.6 - Filled Area Plot
Visualize data in a filled area plot
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| X-axis Attribute | ✓ | String | - | The attribute for your x-axis |
| Y-axis Attribute | ✓ | String | - | The attribute for your y-axis |
| Line Group | | String | - | The attribute for group of each line |
| Color | | String | - | Choose an attribute to color the plot |
| Split Plot by Line Group | ✓ | Boolean | false | Do you want to split the graph |
| Pattern | | String | - | Add texture to the chart based on an attribute |
Output Ports
9.1.7 - Gantt Chart
A Gantt chart is a type of bar chart that illustrates a project schedule. The chart lists the tasks to be performed on the vertical axis, and time intervals on the horizontal axis. The width of the horizontal bars in the graph shows the duration of each activity.
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Pattern | | String | - | Add texture to the chart based on an attribute |
| Start Datetime Column | ✓ | String (timestamp) | - | The start timestamp of the task |
| Finish Datetime Column | ✓ | String (timestamp) | - | The end timestamp of the task |
| Task Column | ✓ | String | - | The name of the task |
| Color Column | | String | - | Column to color tasks |
Output Ports
9.1.8 - Hierarchy Chart
Visualize data in hierarchy
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Chart Type | ✓ | treemap, sunburst | - | Treemap or Sunburst |
| Hierarchy Path | ✓ | List | - | Hierarchy of attributes from a higher-level category to lower-level category |
| ↳ Attribute Name | ✓ | String | - | |
| Value Column | ✓ | String (integer, long, double) | - | The value associated with the size of each sector in the chart |
Output Ports
9.1.9 - Icicle Chart
Visualize hierarchical data from root to leaves
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Hierarchy Path | ✓ | List | - | Hierarchy of attributes from a root (higher-level category) to leaves (lower-level category) |
| ↳ Attribute Name | ✓ | String | - | |
| Value Column | ✓ | String (integer, long, double) | - | The value associated with the size of each sector in the chart |
Output Ports
9.1.10 - Line Chart
View the result in a line chart
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Y Label | | String | Y Axis | The label for y axis |
| X Label | | String | X Axis | The label for x axis |
| Lines | ✓ | List | - | |
| ↳ Y Value | ✓ | String | - | Value for y axis |
| ↳ X Value | ✓ | String | - | Value for x axis |
| ↳ Line Mode | ✓ | line, dots, line with dots | line with dots | |
| ↳ Line Name | | String | - | |
| ↳ Line Color | | String | - | Must be a valid CSS color or hex color string |
Output Ports
9.1.11 - Pie Chart
Visualize data in a Pie Chart
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Value Column | ✓ | String (integer, long, double) | - | The value associated with slice of pie |
| Name Column | ✓ | String | - | The name of the slice of pie |
Output Ports
9.1.12 - Range Slider
Visualize data in a Range Slider
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Y-axis | ✓ | String | - | The name of the column to represent y-axis |
| X-axis | ✓ | String | - | The name of the column to represent the x-axis |
| Handle Duplicates | | Nothing, Mean, Sum | NOTHING | How to handle duplicate values in y-axis |
Output Ports
9.1.13 - Sankey Diagram
Visualize data using a Sankey diagram
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Source Attribute | ✓ | String | - | The source node of the Sankey diagram |
| Target Attribute | ✓ | String | - | The target node of the Sankey diagram |
| Value Attribute | ✓ | String | - | The value/volume of the flow between source and target |
Output Ports
9.1.14 - Scatter Plot
View the result in a scatterplot
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| X-Column | ✓ | String (integer, double) | - | X Column |
| Y-Column | ✓ | String (integer, double) | - | Y Column |
| Alpha Value | | Double | 1.0 | Alpha (opacity) value from 0.0 (transparent) to 1.0 (opaque) |
| Color-Column | | String | - | Dots will be assigned different colors based on their values of this column |
| log scale X | | Boolean | false | Values in the X-column are log-scaled |
| log scale Y | | Boolean | false | Values in the Y-column are log-scaled |
| Hover column | | String | - | Column value to display when a dot is hovered over |
Output Ports
9.1.15 - Tables Plot
Visualize data in a table chart.
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Add Attribute | ✓ | List | - | List of columns to include in the table chart |
| ↳ Attribute Name | ✓ | String | - | |
Output Ports
9.1.16 - Time Series Plot
Visualize trends and patterns over time.
Home > Visualization > Basic
| Property | Requirement | Type | Default | Description |
|---|
| Time Column | ✓ | String | - | The column containing time/date values (e.g., Date, Timestamp) |
| Value Column | ✓ | String | - | The numerical column to plot on the Y-axis (e.g., Sales, Temperature) |
| Category Column | | String | No Selection | Optional - A categorical column to create separate lines |
| Facet Column | | String | No Selection | Optional - A column to create separate subplots |
| Plot Type | ✓ | String | line | Select the type of time series plot (line, area) |
| Show Range Slider | | Boolean | false | Display a range slider at the bottom of the plot |
Output Ports
9.2 - Statistical
Operators in the Statistical category
Home > Visualization > Statistical
Operators
| Operator | Description |
|---|
| Box/Violin Plot | Visualize data using either a Box Plot or a Violin Plot. Box plots are drawn as a box with a vertical line down the middle that marks the median, and horizontal lines attached to each side (known as “whiskers”). Violin plots provide more detail by showing a smoothed density curve on each side, and also include a box plot inside for comparison. |
| Continuous Error Bands | Visualize error or uncertainty along a continuous line |
| Empirical Cumulative Distribution Plot | Visualize the empirical cumulative distribution of a numeric column. |
| Histogram | Visualize data in a Histogram Chart |
| Histogram2D | Displays a bivariate histogram as a density heatmap |
| Scatter Matrix Chart | Visualize datasets in a Scatter Matrix |
| Strip Chart | Visualize distribution of data points as a strip plot |
| Tree Plot | Visualize hierarchical data as a top-down, interactive, auto-sizing tree |
Total: 8 operators
9.2.1 - Box/Violin Plot
Visualize data using either a Box Plot or a Violin Plot. Box plots are drawn as a box with a vertical line down the middle that marks the median, and horizontal lines attached to each side (known as “whiskers”). Violin plots provide more detail by showing a smoothed density curve on each side, and also include a box plot inside for comparison.
Home > Visualization > Statistical
| Property | Requirement | Type | Default | Description |
|---|
| Value Column | ✓ | String (integer, long, double) | - | Data column for box plot |
| Quartile Method | ✓ | linear, inclusive, exclusive | linear | |
| Horizontal Orientation | | Boolean | false | Orientation style |
| Violin Plot | | Boolean | false | Check this box to overlay a violin plot on the box plot; otherwise, show only the box plot |
Output Ports
9.2.2 - Continuous Error Bands
Visualize error or uncertainty along a continuous line
Home > Visualization > Statistical
| Property | Requirement | Type | Default | Description |
|---|
| X Label | | String | X Axis | Label used for x axis |
| Y Label | | String | Y Axis | Label used for y axis |
| Bands | ✓ | List | - | |
| ↳ Y-Axis Upper Bound | ✓ | String | - | Represents upper bound error of y-values |
| ↳ Y-Axis Lower Bound | ✓ | String | - | Represents lower bound error of y-values |
| ↳ Fill Color | | String | - | Must be a valid CSS color or hex color string |
| ↳ Y Value | ✓ | String | - | Value for y axis |
| ↳ X Value | ✓ | String | - | Value for x axis |
| ↳ Line Mode | ✓ | line, dots, line with dots | line with dots | |
| ↳ Line Name | | String | - | |
| ↳ Line Color | | String | - | Must be a valid CSS color or hex color string |
Output Ports
9.2.3 - Empirical Cumulative Distribution Plot
Visualize the empirical cumulative distribution of a numeric column.
Home > Visualization > Statistical
| Property | Requirement | Type | Default | Description |
|---|
| Value Column | ✓ | String (integer, long, double) | - | Numeric column used to compute the empirical cumulative distribution |
| Color Column | | String | - | Optional column for coloring ECDF lines by group |
| Separate By Column | | String | - | Optional column for splitting ECDF plots into subplots |
| Y Axis Mode | | String | probability | Display cumulative probability, raw count, or cumulative sum |
| CDF Mode | | String | standard | ‘standard’ shows P(X ≤ x), ‘reversed’ shows P(X ≥ x), ‘complementary’ shows 1 - P(X ≤ x) |
| Orientation | | String | vertical | Plot ECDF vertically or horizontally |
| Show Markers | | Boolean | false | Display sample markers on the ECDF line |
| Marginal Plot | | String | none | Optional marginal plot to display alongside the ECDF |
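The three CDF Mode formulas in the table above can be worked through on a small sample. The following is a minimal sketch, not the operator's implementation; the function name `ecdf_value` and the sample data are hypothetical and exist only for illustration.

```python
from bisect import bisect_left, bisect_right


def ecdf_value(sample, x, mode="standard"):
    """Evaluate one ECDF variant at x for a list of numbers (illustrative only)."""
    data = sorted(sample)
    n = len(data)
    if n == 0:
        raise ValueError("sample must not be empty")
    at_most = bisect_right(data, x)  # number of values <= x
    below = bisect_left(data, x)     # number of values < x
    if mode == "standard":
        return at_most / n           # P(X <= x)
    if mode == "reversed":
        return (n - below) / n       # P(X >= x)
    if mode == "complementary":
        return 1 - at_most / n       # 1 - P(X <= x)
    raise ValueError(f"unknown mode: {mode}")


sample = [1, 2, 2, 3, 5]
for mode in ("standard", "reversed", "complementary"):
    print(mode, ecdf_value(sample, 2, mode))
```

Under these definitions, 'reversed' and 'complementary' differ only at values that actually occur in the sample, because P(X ≥ x) counts ties at x while 1 - P(X ≤ x) does not.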
Output Ports
9.2.4 - Histogram
Visualize data in a Histogram Chart
Home > Visualization > Statistical
| Property | Requirement | Type | Default | Description |
|---|
| Color Column | | String | - | Column for differentiating data by its value |
| SeparateBy Column | | String | - | Column for separating histogram chart by its value |
| Distribution Type | | String | - | Distribution type (rug, box, violin) |
| Pattern | | String | - | Add texture to the chart based on an attribute |
| Value Column | ✓ | String | - | Column for counting values |
Output Ports
9.2.5 - Histogram2D
Displays a bivariate histogram as a density heatmap
Home > Visualization > Statistical
| Property | Requirement | Type | Default | Description |
|---|
| X Column | ✓ | String | - | Numeric column for the X axis bins |
| Y Column | ✓ | String | - | Numeric column for the Y axis bins |
| X Bins | ✓ | Integer | 10 | Number of bins along the X axis (Default: 10) |
| Y Bins | ✓ | Integer | 10 | Number of bins along the Y axis (Default: 10) |
| Normalization | | density, probability, percent | density | Type of histogram normalization |
Output Ports
9.2.6 - Scatter Matrix Chart
Visualize datasets in a Scatter Matrix
Home > Visualization > Statistical
| Property | Requirement | Type | Default | Description |
|---|
| Selected Attributes | ✓ | List | - | The axes of each scatter plot in the matrix |
| Color Column | ✓ | String | - | Column to color points |
Output Ports
9.2.7 - Strip Chart
Visualize distribution of data points as a strip plot
Home > Visualization > Statistical
| Property | Requirement | Type | Default | Description |
|---|
| X-Axis Column | ✓ | String | - | Column containing numeric values for the x-axis |
| Y-Axis Column | ✓ | String | - | Column containing categorical values for the y-axis |
| Color By | | String | - | Optional - Color points by category |
| Facet Column | | String | - | Optional - Create separate subplots for each category |
Output Ports
9.2.8 - Tree Plot
Visualize hierarchical data as a top-down, interactive, auto-sizing tree
Home > Visualization > Statistical
| Property | Requirement | Type | Default | Description |
|---|
| Edge List Column | ✓ | String | - | Column with [parent, child] pairs |
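To illustrate what a column of [parent, child] pairs encodes, the sketch below rebuilds a top-down hierarchy from an edge list. It assumes each value is a two-element list; how Texera actually parses the column is not stated here, and the helper names `build_tree` and `render` are hypothetical.

```python
from collections import defaultdict


def build_tree(edges):
    """Return (roots, children) for a list of [parent, child] pairs."""
    children = defaultdict(list)
    has_parent = set()
    nodes = []  # nodes in first-seen order
    for parent, child in edges:
        children[parent].append(child)
        has_parent.add(child)
        for node in (parent, child):
            if node not in nodes:
                nodes.append(node)
    # A root is any node that never appears as a child.
    roots = [node for node in nodes if node not in has_parent]
    return roots, dict(children)


def render(node, children, depth=0):
    """Return the subtree under node as indented text lines."""
    lines = ["  " * depth + str(node)]
    for child in children.get(node, []):
        lines.extend(render(child, children, depth + 1))
    return lines


edges = [["root", "a"], ["root", "b"], ["a", "c"]]
roots, children = build_tree(edges)
for root in roots:
    print("\n".join(render(root, children)))
```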
Output Ports
9.3 - Scientific
Operators in the Scientific category
Home > Visualization > Scientific
Operators
Total: 14 operators
9.3.1 - Carpet Plot
Visualize data in a Carpet Plot
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| First Parameter Axis Column | ✓ | String | - | Column representing the first parameter axis (a) |
| Second Parameter Axis Column | ✓ | String | - | Column representing the second parameter axis (b) |
| Value Column | ✓ | String | - | Column representing the value at each (a, b) coordinate |
Output Ports
9.3.2 - Contour Plot
Displays terrain or gradient variations in a Contour Plot
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Grid Size | | String | 10 | Grid resolution of the final image |
| Connect Gaps | | Boolean | true | Automatically fill in the missing parts |
| x | ✓ | String | - | The column name of X-axis |
| y | ✓ | String | - | The column name of Y-axis |
| z | ✓ | String | - | The column name of color bar |
| Coloring Method | | heatmap, lines, none | heatmap | |
Output Ports
9.3.3 - Dendrogram
Visualize data in a Dendrogram
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Color Threshold | | String | - | Value at which separation of clusters will be made |
| Value X Column | ✓ | String | - | The x values of points in dendrogram |
| Value Y Column | ✓ | String | - | The y value of points in dendrogram |
| Labels | ✓ | String | - | The label of points in dendrogram |
Output Ports
9.3.4 - Heatmap
Visualize data in a HeatMap Chart
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Value X Column | ✓ | String | - | The values along the x-axis |
| Value Y Column | ✓ | String | - | The values along the y-axis |
| Values | ✓ | String | - | The values of the heatmap |
Output Ports
9.3.5 - Network Graph
Visualize data in a network graph
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Source Column | ✓ | String | - | Source node for edge in graph |
| Destination Column | ✓ | String | - | Destination node for edge in graph |
| Title | | String | Network Graph | |
Output Ports
9.3.6 - Parallel Coordinates Plot
Visualize multivariate data using parallel coordinate axes
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Dimensions | ✓ | List | - | List of numeric columns to visualize as parallel axes (min: 1, At least one dimension is required) |
| Color Column | | String | - | Column used to color or group the lines |
Output Ports
9.3.7 - Polar Chart
Displays data points in a polar scatter plot
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| r | ✓ | String | - | The column name for radial values (must be numeric) |
| theta | ✓ | String | - | The column name for angular values (must be numeric) |
Output Ports
9.3.8 - Quiver Plot
Visualize vector data in a Quiver Plot
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| x | ✓ | String | - | Column for the x-coordinate of the starting point |
| y | ✓ | String | - | Column for the y-coordinate of the starting point |
| u | ✓ | String | - | Column for the vector component in the x-direction |
| v | ✓ | String | - | Column for the vector component in the y-direction |
Output Ports
9.3.9 - Radar Chart
Visualize data in a Radar Chart
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Name Column | ✓ | String | - | Column containing entity names for each radar |
| Value Columns | ✓ | List | - | Columns containing numeric values for radar chart axes |
| Fill Opacity | ✓ | Double | 0.5 | Opacity value for radar chart fill from 0.0 (transparent) to 1.0 (opaque) |
Output Ports
9.3.10 - Radar Plot
View the result in a radar plot.
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Axes | ✓ | List | - | Numeric columns to use as radar axes |
| Trace Name Column | | String | No Selection | Optional - Select a column to use for naming each radar trace |
| Trace Color Column | | String | No Selection | Optional - Select a column to use for coloring each radar trace (note: if there are too many traces with distinct coloring values, colors may repeat) |
| Line Pattern | ✓ | solid, dash, dot | solid | Pattern of the lines connecting points on the radar plot |
| Max Normalize | ✓ | Boolean | true | Normalize radar plot values by scaling them relative to the maximum value on their respective axes |
| Fill Trace | ✓ | Boolean | true | Fill the area within each radar trace |
| Show Point Markers | ✓ | Boolean | true | Display point markers on the radar plot |
| Show Legend | | Boolean | true | Display the legend (note: without the legend, you are unable to selectively hide or show traces in the plot) |
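The Max Normalize option above describes scaling each value relative to the maximum on its axis. A minimal sketch of that arithmetic follows, assuming non-negative numeric values and rows held as dictionaries; the function name `max_normalize` and the sample rows are hypothetical, and the operator's own handling of zeros or negatives may differ.

```python
def max_normalize(rows, axes):
    """Scale each axis column by its maximum value across all rows."""
    maxima = {axis: max(row[axis] for row in rows) for axis in axes}
    normalized = []
    for row in rows:
        scaled = dict(row)  # keep non-axis columns such as trace names
        for axis in axes:
            peak = maxima[axis]
            # Guard against division by zero when an axis is all zeros.
            scaled[axis] = row[axis] / peak if peak else 0.0
        normalized.append(scaled)
    return normalized


rows = [
    {"name": "a", "speed": 50, "power": 2},
    {"name": "b", "speed": 100, "power": 8},
]
for row in max_normalize(rows, ["speed", "power"]):
    print(row)
```

After this scaling every axis tops out at 1.0, so axes measured in very different units can share one radar without the largest unit dominating the shape.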
Output Ports
9.3.11 - Ternary Contour
Shows how a measured value changes across all mixtures of three components that sum to a constant
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Variable 1 | ✓ | String | - | First variable data field |
| Variable 2 | ✓ | String | - | Second variable data field |
| Variable 3 | ✓ | String | - | Third variable data field |
| Measured Value | ✓ | String | - | Measured value data field |
Output Ports
9.3.12 - Ternary Plot
Points are graphed on a Ternary Plot using 3 specified data fields
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Variable 1 | ✓ | String | - | First variable data field |
| Variable 2 | ✓ | String | - | Second variable data field |
| Variable 3 | ✓ | String | - | Third variable data field |
| Categorize by Color | | Boolean | false | Optionally color points using a data field |
| Color Data Field | | String | - | Specify the data field to color |
Output Ports
9.3.13 - Volcano Plot
Displays statistical significance versus effect size
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Effect Size (log2 Fold Change) | ✓ | String | - | Select the column representing the effect size or magnitude of change between two experimental groups. This value is typically a log2 fold change and is used for the x-axis of the volcano plot |
| P-Value Column | ✓ | String | - | Select the column representing the p-value associated with the statistical test for each feature. This value is transformed using -log10(p-value) and plotted on the y-axis to indicate statistical significance |
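The -log10(p-value) transform described above maps small p-values to large y-coordinates. The sketch below shows the arithmetic for a single point; the function name `volcano_point` and the range check are assumptions made for illustration, not part of the operator.

```python
import math


def volcano_point(log2_fold_change, p_value):
    """Return the (x, y) coordinates of one feature on a volcano plot."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in the interval (0, 1]")
    # Smaller p-values (more significant) land higher on the y-axis.
    return log2_fold_change, -math.log10(p_value)


for fold_change, p in [(1.5, 0.001), (-0.2, 0.5), (2.0, 0.05)]:
    print(volcano_point(fold_change, p))
```

A p-value of exactly 0 has no finite -log10, which is why this sketch rejects it rather than plotting it.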
Output Ports
9.3.14 - Wind Rose Chart
Displays wind distribution using a polar bar chart
Home > Visualization > Scientific
| Property | Requirement | Type | Default | Description |
|---|
| Radial Values (r) | ✓ | String | - | Numeric values representing magnitude (e.g., frequency) |
| Angular Values (θ) | ✓ | String | - | Direction or angle categories (e.g., N, NE, E) |
| Color Group | | String | - | Optional grouping column (e.g., wind strength) |
Output Ports
9.4 - Financial
Operators in the Financial category
Home > Visualization > Financial
Operators
| Operator | Description |
|---|
| Bullet Chart | Visualize data using a Bullet Chart that shows a primary quantitative bar and delta indicator. Optional elements such as qualitative ranges (steps) and a performance threshold are displayed only when provided. |
| Candlestick Chart | Visualize data in a Candlestick Chart |
| Funnel Plot | Visualize data in a Funnel Plot |
| Gauge Chart | Visualize a single value with a radial gauge chart, showing progress towards a goal with optional steps, threshold, and delta. |
| Waterfall Chart | Visualize data as a waterfall chart |
Total: 5 operators
9.4.1 - Bullet Chart
Visualize data using a Bullet Chart that shows a primary quantitative bar and delta indicator. Optional elements such as qualitative ranges (steps) and a performance threshold are displayed only when provided.
Home > Visualization > Financial
| Property | Requirement | Type | Default | Description |
|---|
| Value | ✓ | String | - | The actual value to display on the bullet chart |
| Delta Reference | ✓ | String | - | The reference value for the delta indicator. e.g., 100 |
| Threshold Value | | String | - | The performance threshold value. e.g., 100 |
| Steps | | List | [] | Optional: Each step includes a start and end value e.g., 0, 100 |
| ↳ Start | | String | - | |
| ↳ End | | String | - | |
Output Ports
9.4.2 - Candlestick Chart
Visualize data in a Candlestick Chart
Home > Visualization > Financial
| Property | Requirement | Type | Default | Description |
|---|
| Date Column | ✓ | String | - | The date of the candlestick |
| Opening Price Column | ✓ | String | - | The opening price of the candlestick |
| Highest Price Column | ✓ | String | - | The highest price of the candlestick |
| Lowest Price Column | ✓ | String | - | The lowest price of the candlestick |
| Closing Price Column | ✓ | String | - | The closing price of the candlestick |
Output Ports
9.4.3 - Funnel Plot
Visualize data in a Funnel Plot
Home > Visualization > Financial
| Property | Requirement | Type | Default | Description |
|---|
| X Column | ✓ | String | - | Data column for the x-axis |
| Y Column | ✓ | String | - | Data column for the y-axis |
| Color Column | | String | - | Column to categorically colorize funnel sections |
Output Ports
9.4.4 - Gauge Chart
Visualize a single value with a radial gauge chart, showing progress towards a goal with optional steps, threshold, and delta.
Home > Visualization > Financial
| Property | Requirement | Type | Default | Description |
|---|
| Gauge Value | ✓ | String | - | The primary value displayed on the gauge chart |
| Delta | | String | - | The baseline value used to calculate the delta from the gauge value |
| Threshold Value | | String | - | Defines a boundary or target value shown on the gauge chart |
| Steps | | List | - | List of step ranges for the gauge |
| ↳ Start | | String | - | |
| ↳ End | | String | - | |
Output Ports
9.4.5 - Waterfall Chart
Visualize data as a waterfall chart
Home > Visualization > Financial
| Property | Requirement | Type | Default | Description |
|---|
| X Axis Values | ✓ | String | - | The column representing categories or stages |
| Y Axis Values | ✓ | String | - | The column representing numeric values for each stage |
Output Ports
9.5 - Media
Operators in the Media category
Home > Visualization > Media
Operators
Total: 4 operators
9.5.1 - HTML Visualizer
Render the result of HTML content
Home > Visualization > Media
| Property | Requirement | Type | Default | Description |
|---|
| HTML content | ✓ | String | - | |
Output Ports
9.5.2 - Image Visualizer
Visualize image content
Home > Visualization > Media
| Property | Requirement | Type | Default | Description |
|---|
| image content column | ✓ | String | - | The Binary data of the Image |
Output Ports
9.5.3 - URL Visualizer
Render the content of URL
Home > Visualization > Media
| Property | Requirement | Type | Default | Description |
|---|
| URL content | ✓ | String | - | |
Output Ports
9.5.4 - Word Cloud
Generate word cloud for texts
Home > Visualization > Media
| Property | Requirement | Type | Default | Description |
|---|
| Text column | ✓ | String | - | |
| Number of most frequent words | | Integer | 100 | |
Output Ports
9.6 - Advanced
Operators in the Advanced category
Home > Visualization > Advanced
Operators
| Operator | Description |
|---|
| Choropleth Map | Visualize data using a Choropleth Map that uses shades of colors to show differences in properties or quantities between regions |
| Scatter3D Chart | Visualize data in a Scatter3D Plot |
Total: 2 operators
9.6.1 - Choropleth Map
Visualize data using a Choropleth Map that uses shades of colors to show differences in properties or quantities between regions
Home > Visualization > Advanced
| Property | Requirement | Type | Default | Description |
|---|
| Locations Column | ✓ | String | - | Column used to describe location. Currently only countries are supported, and each value must be a three-letter ISO country code |
| Color Column | ✓ | String (integer, long, double) | - | Column used to determine intensity of color of the region |
Output Ports
9.6.2 - Scatter3D Chart
Visualize data in a Scatter3D Plot
Home > Visualization > Advanced
| Property | Requirement | Type | Default | Description |
|---|
| X Column | ✓ | String | - | Data column for the x-axis |
| Y Column | ✓ | String | - | Data column for the y-axis |
| Z Column | ✓ | String | - | Data column for the z-axis |
Output Ports
9.7 - Nested Table
Visualize Data in a Depth Two Nested Table
Home > Visualization
| Property | Requirement | Type | Default | Description |
|---|
| Add Attribute | ✓ | List | - | List of columns to include in the nested table chart and their subgroup |
| ↳ Attribute group | ✓ | String | - | |
| ↳ Original attribute Name | ✓ | String | - | |
| ↳ New Attribute Name | | String | - | |
Output Ports
10 - Control Block
Operators in the Control Block category
Home > Control Block
Operators
| Operator | Description |
|---|
| If | If |
| Sleep | Sleep n seconds between each tuple |
Total: 2 operators
10.1 - If
If
Home > Control Block
| Property | Requirement | Type | Default | Description |
|---|
| Condition State | ✓ | String | - | Name of the state variable to evaluate |
Output Ports
10.2 - Sleep
Sleep n seconds between each tuple
Home > Control Block
| Property | Requirement | Type | Default | Description |
|---|
| Sleep Time (seconds) | ✓ | Integer | 0 | |
Output Ports
11 - Output Port Modes
Reference for operator output port modes
Home
Texera operators emit data through output ports. Each port advertises a mode that describes how downstream operators should interpret the stream of tuples it produces.
Set Snapshot
The port re-emits the complete result set on each update. Downstream operators always see the full materialized result.
Delta Updates
The port emits an incremental delta of the result set on each update. Downstream operators apply the delta on top of prior state instead of receiving a re-materialized snapshot.
Single Snapshot
The port emits exactly one snapshot for the entire execution (not per update). Used for visualization operators whose output may exceed the memory limit, making repeated full-snapshot emission impractical.
12 - Parameter Reference
Complete reference for machine learning operator parameters
← Home
Available Parameter Sets
12.1 - SklearnAdvancedKNN Parameters
Hyperparameters accepted by SklearnAdvancedKNN
← Parameters Index
Used By
This parameter set is used by the following operators:
Parameters
| Parameter | Type |
|---|
| n_neighbors | int |
| p | int |
| weights | str |
| algorithm | str |
| leaf_size | int |
| metric | int |
| metric_params | str |
12.2 - SklearnAdvancedSVC Parameters
Hyperparameters accepted by SklearnAdvancedSVC
← Parameters Index
Used By
This parameter set is used by the following operators:
Parameters
| Parameter | Type |
|---|
| C | float |
| kernel | str |
| gamma | float |
| degree | int |
| coef0 | float |
| tol | float |
| probability | (lambda value: value.lower() == "true") |
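The Type column above reads as a set of Python converters: `float`, `int`, `str`, and a lambda that turns the text "true" into a boolean. The sketch below shows how such a table could be applied, assuming the settings arrive as strings and that a blank value means "not set"; both are assumptions, and the names `CONVERTERS` and `coerce_params` are hypothetical rather than Texera's API.

```python
# Converter table mirroring the Type column of the SVC parameter set.
CONVERTERS = {
    "C": float,
    "kernel": str,
    "gamma": float,
    "degree": int,
    "coef0": float,
    "tol": float,
    "probability": lambda value: value.lower() == "true",
}


def coerce_params(raw):
    """Convert string-valued settings to typed values, skipping blanks."""
    params = {}
    for name, text in raw.items():
        if text == "":
            continue  # assumption: blank means "leave unset"
        params[name] = CONVERTERS[name](text)
    return params


raw = {"C": "1.0", "kernel": "rbf", "degree": "3", "probability": "True", "gamma": ""}
print(coerce_params(raw))
```

The boolean lambda is needed because `bool("false")` is `True` in Python: any non-empty string is truthy, so a case-insensitive comparison against "true" is the safer conversion.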
12.3 - SklearnAdvancedSVR Parameters
Hyperparameters accepted by SklearnAdvancedSVR
← Parameters Index
Used By
This parameter set is used by the following operators:
Parameters
| Parameter | Type |
|---|
| C | float |
| kernel | str |
| gamma | float |
| degree | int |
| coef0 | float |
| tol | float |
| shrinking | (lambda value: value.lower() == "true") |
| verbose | (lambda value: value.lower() == "true") |
| epsilon | float |
| cache_size | int |
| max_iter | int |