Replace datetime value in GCS JSON file using sed
I need to replace the datetime values in a JSON file stored in GCS with dates:
e.g. the datetime value "2020-04-18 10:09:09.433000" should be replaced with "2020-04-18", i.e. the time part should be stripped out.
I have tried the following:
gsutil cp gs://bucket/cloudsql_to_bigquery/Accounts_test - \
| sed -e's/[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]*//g' \
| gsutil cp - gs://bucket/cloudsql_to_bigquery/Accounts_test
But I keep ending up with an application/octet-stream file type instead of JSON, and I'm not sure why. My sed also isn't working as expected. Any clue how I can fix this?
Sample json file:
{"_etblBudgets_Checksum": null, "_etblBudgets_dCreatedDate": null, "_etblBudgets_dModifiedDate": "2020-03-20 06:37:29.630000", "_etblBudgets_iBranchID": null, "_etblBudgets_iChangeSetID": 1, "_etblBudgets_iCreatedAgentID": null, "_etblBudgets_iCreatedBranchID": null, "_etblBudgets_iModifiedAgentID": 0, "_etblBudgets_iModifiedBranchID": 0, "dBudgetDTStamp": null, "fBudget": null, "fBudgetForeign": 0.0, "fForecast": 0.0, "fForecastForeign": 0.0, "fUnprocessedPOValue": -1768.0100000000002, "fUnprocessedPOValueForeign": 0.0, "iBudgetAccountID": 137, "iBudgetAccountType": 36, "iBudgetPeriodID": 12, "iBudgetProjectID": 7, "iBudgetTxBranchID": 0, "idBudgets": 1}
{"_etblBudgets_Checksum": null, "_etblBudgets_dCreatedDate": null, "_etblBudgets_dModifiedDate": "2020-03-20 06:37:29.630000", "_etblBudgets_iBranchID": null, "_etblBudgets_iChangeSetID": 1, "_etblBudgets_iCreatedAgentID": null, "_etblBudgets_iCreatedBranchID": null, "_etblBudgets_iModifiedAgentID": 0, "_etblBudgets_iModifiedBranchID": 0, "dBudgetDTStamp": null, "fBudget": null, "fBudgetForeign": 0.0, "fForecast": 0.0, "fForecastForeign": 0.0, "fUnprocessedPOValue": -19238.68, "fUnprocessedPOValueForeign": 0.0, "iBudgetAccountID": 138, "iBudgetAccountType": 36, "iBudgetPeriodID": 12, "iBudgetProjectID": 7, "iBudgetTxBranchID": 0, "idBudgets": 2}
{"_etblBudgets_Checksum": null, "_etblBudgets_dCreatedDate": null, "_etblBudgets_dModifiedDate": "2020-03-20 06:37:29.630000", "_etblBudgets_iBranchID": null, "_etblBudgets_iChangeSetID": 1, "_etblBudgets_iCreatedAgentID": null, "_etblBudgets_iCreatedBranchID": null, "_etblBudgets_iModifiedAgentID": 0, "_etblBudgets_iModifiedBranchID": 0, "dBudgetDTStamp": null, "fBudget": null, "fBudgetForeign": 0.0, "fForecast": 0.0, "fForecastForeign": 0.0, "fUnprocessedPOValue": -38647.87, "fUnprocessedPOValueForeign": 0.0, "iBudgetAccountID": 138, "iBudgetAccountType": 36, "iBudgetPeriodID": 12, "iBudgetProjectID": 8, "iBudgetTxBranchID": 0, "idBudgets": 3}
{"_etblBudgets_Checksum": null, "_etblBudgets_dCreatedDate": null, "_etblBudgets_dModifiedDate": "2020-03-20 06:37:29.630000", "_etblBudgets_iBranchID": null, "_etblBudgets_iChangeSetID": 1, "_etblBudgets_iCreatedAgentID": null, "_etblBudgets_iCreatedBranchID": null, "_etblBudgets_iModifiedAgentID": 0, "_etblBudgets_iModifiedBranchID": 0, "dBudgetDTStamp": null, "fBudget": null, "fBudgetForeign": 0.0, "fForecast": 0.0, "fForecastForeign": 0.0, "fUnprocessedPOValue": 0.0, "fUnprocessedPOValueForeign": 0.0, "iBudgetAccountID": 138, "iBudgetAccountType": 36, "iBudgetPeriodID": 12, "iBudgetProjectID": 9, "iBudgetTxBranchID": 0, "idBudgets": 4}
Expected output:
{"_etblBudgets_Checksum": null, "_etblBudgets_dCreatedDate": null, "_etblBudgets_dModifiedDate": "2020-03-20", "_etblBudgets_iBranchID": null, "_etblBudgets_iChangeSetID": 1, "_etblBudgets_iCreatedAgentID": null, "_etblBudgets_iCreatedBranchID": null, "_etblBudgets_iModifiedAgentID": 0, "_etblBudgets_iModifiedBranchID": 0, "dBudgetDTStamp": null, "fBudget": null, "fBudgetForeign": 0.0, "fForecast": 0.0, "fForecastForeign": 0.0, "fUnprocessedPOValue": -1768.0100000000002, "fUnprocessedPOValueForeign": 0.0, "iBudgetAccountID": 137, "iBudgetAccountType": 36, "iBudgetPeriodID": 12, "iBudgetProjectID": 7, "iBudgetTxBranchID": 0, "idBudgets": 1}
{"_etblBudgets_Checksum": null, "_etblBudgets_dCreatedDate": null, "_etblBudgets_dModifiedDate": "2020-03-20", "_etblBudgets_iBranchID": null, "_etblBudgets_iChangeSetID": 1, "_etblBudgets_iCreatedAgentID": null, "_etblBudgets_iCreatedBranchID": null, "_etblBudgets_iModifiedAgentID": 0, "_etblBudgets_iModifiedBranchID": 0, "dBudgetDTStamp": null, "fBudget": null, "fBudgetForeign": 0.0, "fForecast": 0.0, "fForecastForeign": 0.0, "fUnprocessedPOValue": -19238.68, "fUnprocessedPOValueForeign": 0.0, "iBudgetAccountID": 138, "iBudgetAccountType": 36, "iBudgetPeriodID": 12, "iBudgetProjectID": 7, "iBudgetTxBranchID": 0, "idBudgets": 2}
{"_etblBudgets_Checksum": null, "_etblBudgets_dCreatedDate": null, "_etblBudgets_dModifiedDate": "2020-03-20", "_etblBudgets_iBranchID": null, "_etblBudgets_iChangeSetID": 1, "_etblBudgets_iCreatedAgentID": null, "_etblBudgets_iCreatedBranchID": null, "_etblBudgets_iModifiedAgentID": 0, "_etblBudgets_iModifiedBranchID": 0, "dBudgetDTStamp": null, "fBudget": null, "fBudgetForeign": 0.0, "fForecast": 0.0, "fForecastForeign": 0.0, "fUnprocessedPOValue": -38647.87, "fUnprocessedPOValueForeign": 0.0, "iBudgetAccountID": 138, "iBudgetAccountType": 36, "iBudgetPeriodID": 12, "iBudgetProjectID": 8, "iBudgetTxBranchID": 0, "idBudgets": 3}
{"_etblBudgets_Checksum": null, "_etblBudgets_dCreatedDate": null, "_etblBudgets_dModifiedDate": "2020-03-20", "_etblBudgets_iBranchID": null, "_etblBudgets_iChangeSetID": 1, "_etblBudgets_iCreatedAgentID": null, "_etblBudgets_iCreatedBranchID": null, "_etblBudgets_iModifiedAgentID": 0, "_etblBudgets_iModifiedBranchID": 0, "dBudgetDTStamp": null, "fBudget": null, "fBudgetForeign": 0.0, "fForecast": 0.0, "fForecastForeign": 0.0, "fUnprocessedPOValue": 0.0, "fUnprocessedPOValueForeign": 0.0, "iBudgetAccountID": 138, "iBudgetAccountType": 36, "iBudgetPeriodID": 12, "iBudgetProjectID": 9, "iBudgetTxBranchID": 0, "idBudgets": 4}
A few suggestions:
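Your original expression runs without -E, so sed treats it as a basic regular expression, in which { and } are literal characters; [0-9]{2} therefore never matches the time. Pass -E to enable extended regular expressions.
I recommend making the match regex as specific as possible, to avoid inadvertently matching something you didn't mean to if the file ever gets data you didn't expect. Anchor on the date and match only the time that follows it, so the rest of each JSON line stays intact (a trailing .* would delete everything up to the end of the line, including the closing quote). Here's a more specific sed expression that would do it:
sed -E 's/([0-9]{4}-[0-9]{2}-[0-9]{2}) [0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?/\1/g'
You can test that your sed expression works locally before trying to download/upload with it, e.g., by running this pipeline from the shell:
echo '"2020-04-18 10:09:09.433000"' | sed -E 's/([0-9]{4}-[0-9]{2}-[0-9]{2}) [0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?/\1/g'
I recommend against doing streaming uploads/downloads (gsutil cp with "-" as a source or destination argument), because they do not use checksum validation of the data, so corruption could go undetected. Instead, use something like:
gsutil cp gs://bucket/cloudsql_to_bigquery/Accounts_test /tmp/Accounts_test
sed -E 's/([0-9]{4}-[0-9]{2}-[0-9]{2}) [0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?/\1/g' /tmp/Accounts_test > /tmp/Accounts_test_truncated_timestamps
gsutil cp /tmp/Accounts_test_truncated_timestamps gs://bucket/cloudsql_to_bigquery/Accounts_test
rm /tmp/Accounts_test /tmp/Accounts_test_truncated_timestamps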
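The steps above can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the names strip_time, upload_json and GSUTIL are hypothetical, the sample record is shortened from the question's data, and the expression is a time-specific variant that removes only the "HH:MM:SS.ffffff" part following a date. The upload helper also sets the content type explicitly with gsutil's top-level -h option, which may address the application/octet-stream result mentioned in the question.

```shell
# Shortened sample record, shaped like the lines in the question.
line='{"_etblBudgets_dModifiedDate": "2020-03-20 06:37:29.630000", "idBudgets": 1}'

# Hypothetical helper: strip the time that follows a YYYY-MM-DD date.
# -E enables extended regexes, so {2} is a repetition count rather than literal text.
strip_time() {
  sed -E 's/([0-9]{4}-[0-9]{2}-[0-9]{2}) [0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?/\1/g'
}

# Check the expression locally before touching the bucket.
result=$(printf '%s\n' "$line" | strip_time)
printf '%s\n' "$result"

# Hypothetical helper: upload a local file with an explicit Content-Type.
# GSUTIL defaults to the real CLI; set it to another command for a dry run.
GSUTIL=${GSUTIL:-gsutil}
upload_json() {
  # $1 = local file, $2 = destination object URI
  "$GSUTIL" -h "Content-Type:application/json" cp "$1" "$2"
}
```

With the real CLI, a call might look like upload_json /tmp/Accounts_test_truncated_timestamps gs://bucket/cloudsql_to_bigquery/Accounts_test. Running with GSUTIL=echo prints the command instead of executing it, which is a cheap way to review the arguments first.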
Related
Flutter: I can't read an image from a local file with readAsBytesSync
I do not understand why an image that comes directly from image_picker works, but an image loaded from the app directory does not. I thought it was a permissions issue (I do have read and write permission), but the images come from the same folder. I don't know if I'm being clear, but has anyone had the same problem?
Code:
print(image);
final fileBytes = image.readAsBytesSync();
print(fileBytes);
Output for the image that was just taken:
File: '/data/user/0/com.wallis.env_wallis/app_flutter/1.jpg'
[255, 216, 255, 225, 1, 149, 69, 120, 105, 102, 0, 0, 77, 77, 0, 42, 0, 0, 0, 8, 0, 10, 1, 59, 0, 2, 0, 0, 0, 22, 0, 0, 0, 134, 1, 0, 0, 4, 0, 0, 0, 1, 0, 0, 2, 128, 1, 16, 0, 2, 0, 0, 0, 6, 0, 0, 0, 156, 1, 1, 0, 4, 0, 0, 0, 1, 0, 0, 1, 224, 1, 15, 0, 2, 0, 0, 0, 7, 0, 0, 0, 162, 1, 14, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 135, 105, 0, 4, 0, 0, 0, 1, 0, 0, 0, 189, 1, 18, 0, 3, 0, 0, 0, 1, 0, 6, 0, 0, 1, 50, 0, 2, 0, 0, 0, 20, 0, 0, 0, 169, 136, 37, 0, 4, 0, 0, 0, 1, 0, 0, 1, 47, 0, 0, 0, 0, 50, 52, 46, 68, 69, 83, 84, 82, 85, 67, 84, 73, 79, 78, 32, 70, 79, 82, 69, 84, 83, 0, 77, 105, 32, 57, 84, 0, 88, 105, 97, 111, 109, 105, 0, 50, 48, 50, 49, 58, 49, 48, 58, 48, 51, 32, 48, 57, 58, 52, 50, 58, 51, 57, 0, 0, 7, 164, 3, 0, 3, 0, 0, 0, 1, 0, 0, 0, 0, 136, 39, 0, 3, 0, 0, 0, 1, 76, 165, 0, 0, 146, 10, 0, 5, 0, 0, 0, 1, 0, 0, 1, 23, 130, 154, 0, 5, 0, 0, 0, 1, 0, 0, 1, 31, 146, 9, 0, 3, 0, 0, 0, 1, 0, 16, 0, 0, 146, 8, 0, 4, 0, 0, 0, 1, 0, 0, 0, 0, 130, 157, 0, 5, 0, 0, 0, 1, 0, 0, 1, 39, 0, 0, 0, 0, 0, 0, 18, 162,
Output for the image loaded from the folder:
File: '/data/user/0/com.wallis.env_wallis/app_flutter/485.jpg'
[]
The EXIF data seems to have trouble being saved after the image has been renamed several times. When I sent an image that had been renamed once it was fine, but renaming the image again caused the problem. So I changed the point at which I rename the images, and that fixed it.
flutter: File download from server
I am downloading file from server but I don't know what happened so, kindly tell me how to solve this issue? await Dio().post("https://test.blockchainhuissieray.com/api/download_file.php", data: { "jwt" : token, "fileID" : id, "directory" : widget.folderName }, options: Options( contentType: ContentType.parse("application/json") )).then((res) => res.data).then((data) async{ var intList = data.toString().codeUnits; print('File : $intList'); print(data); var filePath = await ImagePickerSaver.saveFile(fileData: Uint8List.fromList(intList)); print(filePath); var savedFile = File.fromUri(Uri.file(filePath)); setState(() { _imageFile = Future<File>.sync(() => savedFile); images.add(File(data)); imagel = File(data); }); }).catchError((err){print(err);}); print(images); setState(() { showDialog( context: context, builder: (context){ return AlertDialog( content: Center( child: Image.file(imagel, fit: BoxFit.cover,), ), ); } ); }); I/flutter ( 6918): File : [65533, 65533, 65533, 65533, 2, 118, 69, 120, 105, 102, 0, 0, 77, 77, 0, 42, 0, 0, 0, 8, 0, 8, 1, 16, 0, 2, 0, 0, 0, 26, 0, 0, 0, 110, 1, 0, 0, 4, 0, 0, 0, 1, 0, 0, 3, 65533, 1, 1, 0, 4, 0, 0, 0, 1, 0, 0, 5, 0, 1, 50, 0, 2, 0, 0, 0, 20, 0, 0, 0, 65533, 1, 18, 0, 3, 0, 0, 0, 1, 0, 1, 0, 0, 65533, 105, 0, 4, 0, 0, 0, 1, 0, 0, 0, 65533, 65533, 37, 0, 4, 0, 0, 0, 1, 0, 0, 1, 65533, 1, 15, 0, 2, 0, 0, 0, 7, 0, 0, 0, 65533, 0, 0, 0, 0, 65, 110, 100, 114, 111, 105, 100, 32, 83, 68, 75, 32, 98, 117, 105, 108, 116, 32, 102, 111, 114, 32, 120, 56, 54, 0, 50, 48, 49, 57, 58, 48, 53, 58, 49, 55, 32, 49, 48, 58, 52, 54, 58, 48, 53, 0, 71, 111, 111, 103, 108, 101, 0, 0, 16, 65533, 65533, 0, 5, 0, 0, 0, 1, 0, 0, 1, 105, 65533, 65533, 0, 5, 0, 0, 0, 1, 0, 0, 1, 113, 65533, 65533, 0, 2, 0, 0, 0, 4, 52, 50, 54, 0, 65533, 65533, 0, 2, 0, 0, 0, 4, 52, 50, 54, 0, 65533, 65533, 0, 2, 0, 0, 0, 4, 52, 50, 54, 0, 65533, 10, 0, 5, 0, 0, 0, 1, 0, 0, 1, 121, 65533, 9, 0, 3, 0, 0, 0, 1, 0, 0, 0, 0, 65533, 39, 0, 3, 0, 0, 0, 1, 0, I/flutter ( 6918): 
����vExif V/MediaStore( 6918): Create the thumbnail in memory: origId=266, kind=1, isVideo=false D/skia ( 6918): --- Failed to create image decoder with message 'unimplemented' I/chatty ( 6918): uid=10087(io.hexasoft.bch) identical 1 line D/skia ( 6918): --- Failed to create image decoder with message 'unimplemented' I/flutter ( 6918): saved filePath: I/flutter ( 6918): I/flutter ( 6918): [File: '����vExif
You may want to use the path_provider package instead of the ImagePicker plugin: write the raw bytes to a file in the temporary directory, then use that file's path to display the image. Or, if you don't need to save the image to local storage, you can display it directly from the raw bytes by using Image.memory instead of Image.file.
groupby and join vs window in pyspark
I have a data frame in pyspark which has hundreds of millions of rows (here is a dummy sample of it): import datetime import pyspark.sql.functions as F from pyspark.sql import Window,Row from pyspark.sql.functions import col from pyspark.sql.functions import month, mean,sum,year,avg from pyspark.sql.functions import concat_ws,to_date,unix_timestamp,datediff,lit from pyspark.sql.functions import when,min,max,desc,row_number,col dg = sqlContext.createDataFrame(sc.parallelize([ Row(cycle_dt=datetime.datetime(1984, 5, 2, 0, 0), network_id=4,norm_strength=0.5, spend_active_ind=1,net_spending_amt=0,cust_xref_id=10), Row(cycle_dt=datetime.datetime(1984, 6, 2, 0, 0), network_id=4,norm_strength=0.5, spend_active_ind=1,net_spending_amt=2,cust_xref_id=11), Row(cycle_dt=datetime.datetime(1984, 7, 2, 0, 0), network_id=4,norm_strength=0.5, spend_active_ind=1,net_spending_amt=2,cust_xref_id=12), Row(cycle_dt=datetime.datetime(1984, 4, 2, 0, 0), network_id=4,norm_strength=0.5, spend_active_ind=1,net_spending_amt=2,cust_xref_id=13), Row(cycle_dt=datetime.datetime(1983,11, 5, 0, 0), network_id=1,norm_strength=0.5, spend_active_ind=0,net_spending_amt=8,cust_xref_id=1 ), Row(cycle_dt=datetime.datetime(1983,12, 2, 0, 0), network_id=1,norm_strength=0.5, spend_active_ind=0,net_spending_amt=2,cust_xref_id=1 ), Row(cycle_dt=datetime.datetime(1984, 1, 3, 0, 0), network_id=1,norm_strength=0.5, spend_active_ind=1,net_spending_amt=15,cust_xref_id=1 ), Row(cycle_dt=datetime.datetime(1984, 3, 2, 0, 0), network_id=1,norm_strength=0.5, spend_active_ind=0,net_spending_amt=7,cust_xref_id=1 ), Row(cycle_dt=datetime.datetime(1984, 4, 3, 0, 0), network_id=1,norm_strength=0.5, spend_active_ind=0,net_spending_amt=1,cust_xref_id=1 ), Row(cycle_dt=datetime.datetime(1984, 5, 2, 0, 0), network_id=1,norm_strength=0.5, spend_active_ind=0,net_spending_amt=1,cust_xref_id=1 ), Row(cycle_dt=datetime.datetime(1984,10, 6, 0, 0), network_id=1,norm_strength=0.5, spend_active_ind=1,net_spending_amt=10,cust_xref_id=1 ), 
Row(cycle_dt=datetime.datetime(1984, 1, 7, 0, 0), network_id=1,norm_strength=0.4, spend_active_ind=0,net_spending_amt=8,cust_xref_id=2 ), Row(cycle_dt=datetime.datetime(1984, 1, 2, 0, 0), network_id=1,norm_strength=0.4, spend_active_ind=0,net_spending_amt=3,cust_xref_id=2 ), Row(cycle_dt=datetime.datetime(1984, 2, 7, 0, 0), network_id=1,norm_strength=0.4, spend_active_ind=1,net_spending_amt=5,cust_xref_id=2 ), Row(cycle_dt=datetime.datetime(1985, 2, 7, 0, 0), network_id=1,norm_strength=0.3, spend_active_ind=1,net_spending_amt=8,cust_xref_id=3 ), Row(cycle_dt=datetime.datetime(1985, 3, 7, 0, 0), network_id=1,norm_strength=0.3, spend_active_ind=0,net_spending_amt=2,cust_xref_id=3 ), Row(cycle_dt=datetime.datetime(1985, 4, 7, 0, 0), network_id=1,norm_strength=0.3, spend_active_ind=1,net_spending_amt=1,cust_xref_id=3 ), Row(cycle_dt=datetime.datetime(1985, 4, 8, 0, 0), network_id=1,norm_strength=0.3, spend_active_ind=1,net_spending_amt=9,cust_xref_id=3 ), Row(cycle_dt=datetime.datetime(1984, 4, 2, 0, 0), network_id=2,norm_strength=0.5, spend_active_ind=0,net_spending_amt=3,cust_xref_id=4 ), Row(cycle_dt=datetime.datetime(1984, 4, 3, 0, 0), network_id=2,norm_strength=0.5, spend_active_ind=0,net_spending_amt=2,cust_xref_id=4 ), Row(cycle_dt=datetime.datetime(1984, 1, 2, 0, 0), network_id=2,norm_strength=0.5, spend_active_ind=0,net_spending_amt=5,cust_xref_id=4 ), Row(cycle_dt=datetime.datetime(1984, 1, 3, 0, 0), network_id=2,norm_strength=0.5, spend_active_ind=1,net_spending_amt=6,cust_xref_id=4 ), Row(cycle_dt=datetime.datetime(1984, 3, 2, 0, 0), network_id=2,norm_strength=0.5, spend_active_ind=0,net_spending_amt=2,cust_xref_id=4 ), Row(cycle_dt=datetime.datetime(1984, 1, 5, 0, 0), network_id=2,norm_strength=0.5, spend_active_ind=0,net_spending_amt=9,cust_xref_id=4 ), Row(cycle_dt=datetime.datetime(1984, 1, 6, 0, 0), network_id=2,norm_strength=0.5, spend_active_ind=1,net_spending_amt=1,cust_xref_id=4 ), Row(cycle_dt=datetime.datetime(1984, 1, 7, 0, 0), 
network_id=2,norm_strength=0.4, spend_active_ind=0,net_spending_amt=7,cust_xref_id=5 ), Row(cycle_dt=datetime.datetime(1984, 1, 2, 0, 0), network_id=2,norm_strength=0.4, spend_active_ind=0,net_spending_amt=8,cust_xref_id=5 ), Row(cycle_dt=datetime.datetime(1984, 2, 7, 0, 0), network_id=2,norm_strength=0.4, spend_active_ind=1,net_spending_amt=3,cust_xref_id=5 ), Row(cycle_dt=datetime.datetime(1985, 2, 7, 0, 0), network_id=2,norm_strength=0.6, spend_active_ind=1,net_spending_amt=6,cust_xref_id=6 ), Row(cycle_dt=datetime.datetime(1985, 3, 7, 0, 0), network_id=2,norm_strength=0.6, spend_active_ind=0,net_spending_amt=9,cust_xref_id=6 ), Row(cycle_dt=datetime.datetime(1985, 4, 7, 0, 0), network_id=2,norm_strength=0.6, spend_active_ind=1,net_spending_amt=4,cust_xref_id=6 ), Row(cycle_dt=datetime.datetime(1985, 4, 8, 0, 0), network_id=2,norm_strength=0.6, spend_active_ind=1,net_spending_amt=6,cust_xref_id=6 ), Row(cycle_dt=datetime.datetime(1984, 4, 2, 0, 0), network_id=3,norm_strength=0.5, spend_active_ind=0,net_spending_amt=0,cust_xref_id=7 ), Row(cycle_dt=datetime.datetime(1984, 4, 3, 0, 0), network_id=3,norm_strength=0.5, spend_active_ind=0,net_spending_amt=0,cust_xref_id=7 ), Row(cycle_dt=datetime.datetime(1984, 1, 2, 0, 0), network_id=3,norm_strength=0.5, spend_active_ind=0,net_spending_amt=0,cust_xref_id=7 ), Row(cycle_dt=datetime.datetime(1984, 1, 3, 0, 0), network_id=3,norm_strength=0.5, spend_active_ind=0,net_spending_amt=0,cust_xref_id=7 ), Row(cycle_dt=datetime.datetime(1984, 3, 2, 0, 0), network_id=3,norm_strength=0.5, spend_active_ind=0,net_spending_amt=0,cust_xref_id=7 ), Row(cycle_dt=datetime.datetime(1984, 1, 5, 0, 0), network_id=3,norm_strength=0.5, spend_active_ind=0,net_spending_amt=0,cust_xref_id=7 ), Row(cycle_dt=datetime.datetime(1984, 1, 6, 0, 0), network_id=3,norm_strength=0.5, spend_active_ind=0,net_spending_amt=0,cust_xref_id=7 ), Row(cycle_dt=datetime.datetime(1984, 1, 7, 0, 0), network_id=3,norm_strength=0.4, 
spend_active_ind=0,net_spending_amt=3,cust_xref_id=8 ), Row(cycle_dt=datetime.datetime(1984, 1, 2, 0, 0), network_id=3,norm_strength=0.4, spend_active_ind=0,net_spending_amt=2,cust_xref_id=8 ), Row(cycle_dt=datetime.datetime(1984, 2, 7, 0, 0), network_id=3,norm_strength=0.4, spend_active_ind=1,net_spending_amt=8,cust_xref_id=8 ), Row(cycle_dt=datetime.datetime(1985, 2, 7, 0, 0), network_id=3,norm_strength=0.6, spend_active_ind=1,net_spending_amt=4,cust_xref_id=9 ), Row(cycle_dt=datetime.datetime(1985, 3, 7, 0, 0), network_id=3,norm_strength=0.6, spend_active_ind=0,net_spending_amt=1,cust_xref_id=9 ), Row(cycle_dt=datetime.datetime(1985, 4, 7, 0, 0), network_id=3,norm_strength=0.6, spend_active_ind=1,net_spending_amt=9,cust_xref_id=9 ), Row(cycle_dt=datetime.datetime(1985, 4, 8, 0, 0), network_id=3,norm_strength=0.6, spend_active_ind=0,net_spending_amt=3,cust_xref_id=9 ) ]))
I am trying to sum spend_active_ind for each cust_xref_id and keep the customers whose sum is greater than zero. One way to do this is using groupby and join:
dg1 = dg.groupby("cust_xref_id").agg(sum("spend_active_ind").alias("sum_spend_active_ind"))
dg1 = dg1.filter(dg1.sum_spend_active_ind != 0).select("cust_xref_id")
dg = dg.alias("t1").join(dg1.alias("t2"),col("t1.cust_xref_id")==col("t2.cust_xref_id")).select(col("t1.*"))
The other way I can think of is using a window:
w = Window.partitionBy('cust_xref_id')
dg = dg.withColumn('sum_spend_active_ind',sum(dg.spend_active_ind).over(w))
dg = dg.filter(dg.sum_spend_active_ind!=0)
Which of these methods (or any other method) is more efficient for what I am trying to do? Thanks
You could open the Spark UI at localhost:4040, or inspect the query plan using the explain method:
(
    dg
    .groupby('cust_xref_id')
    .agg(F.sum('spend_active_ind').alias('sum_spend_active_ind'))
    .filter(F.col('sum_spend_active_ind') > 0)
).explain()
Receiving Sysex messages with audiokit
I have an app which is sending controller settings to a hardware synthesizer using sysex. In other words: such a syses messages selects a parameter from the synth, and sets its value. With audiokit this is pretty simple. Such a message looks like this: [240, 00, 32, 51, 1, 16, 112, 00, 40, 95, 247] Which sets parameter 40 (in parameter group 112) to 95 00, 32, 51, 1 defines the synth model, other the part number and channel. Now I try to build the opposite: the synth sends its parameters and values to the app. I do receive such sysex messages in the new versions of audiokit with the receivedMIDISystemCommand(_ data: [MIDIByte]) function. Example with midi connection over bluetooth (when using Yamaha MD-BT01 ): [240, 0, 32, 111, 64, 89, 0, 64, 87, 64, 192, 239, 91, 21, 191, 1, 0, 0, 1, 0, 247] And something different when using wired MIDI to USB (Roland UM-ONE) Using my wired Roland UM-ONE i receive different messages. they all look like this: [240, 0, 32, 15, 223, 170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 240, 0, 32, 15, 223, 170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 51, 1, 16, 15, 223, 170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 51, 1, 16, 15, 223, 170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 111, 64, 120, 15, 223, 170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 111, 64, 120, 15, 223, 170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 247] most messages contain 21, 17 or 13. A lot of messages only contain one number, which is 247 The first two messages contain a lot of numbers, maybe about 30, with mostly zero's The Parameters should have values between 0 and 127 when I send them towards the synthesizer (this are integers in my app, converted to midi bytes just before they are sent: midiByte(127) ). 
But the numbers I receive are higher than 127, as shown in the example above. I think I need to convert these numbers in some way, but how? What type of numbers am I looking at? Can somebody point me towards some possibilities?
SciPy: create a number sequence generator from constraints
I have been reading Probabilistic Programming and Bayesian Methods for Hackers and I'm hopeful that with PyMC3 I can create a number sequence generator. Here are 34 examples of the kind of number sequences I want to generate: (20, [0, 0, 0, 4, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (15, [0, 0, 0, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (9, [0, 0, 0, 4, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (20, [0, 0, 0, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (16, [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (17, [0, 0, 0, 3, 4, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (14, [0, 0, 0, 1, 0, 0, 0, 3, 4, 0, 2, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (13, [0, 0, 0, 5, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (17, [0, 0, 0, 3, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (22, [0, 0, 0, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (19, [0, 0, 0, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (20, [0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 4, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (15, [0, 0, 0, 5, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (20, [0, 0, 0, 6, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 3, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (13, [0, 0, 0, 1, 2, 2, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 1, 4, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (20, [0, 0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 2, 5, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (11, [0, 0, 0, 6, 3, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (21, [0, 0, 0, 5, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (20, [0, 0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 3, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (21, [0, 0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (22, [0, 0, 0, 3, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (23, [0, 0, 0, 4, 
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (20, [0, 0, 0, 4, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 2, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (24, [0, 0, 0, 1, 4, 6, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]) (22, [0, 0, 0, 6, 1, 0, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 1]) (23, [0, 0, 0, 5, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (22, [0, 0, 0, 2, 7, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (22, [0, 0, 0, 6, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (24, [0, 0, 0, 4, 5, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 2, 0, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (23, [0, 0, 0, 6, 5, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 2, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (22, [0, 0, 0, 5, 3, 0, 3, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (24, [0, 0, 0, 5, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (23, [0, 0, 0, 1, 0, 0, 0, 0, 6, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) (21, [0, 0, 0, 4, 5, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) I'm trying to create a number sequence generator that, given the first number in the tuple (like 22), it produces a list of numbers that look like the second: [0, 0, 0, 6, 1, 0, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 1] I understand the 
constraints (the sum of the list equals the first number, some positions in the list are always zero, etc.) and I realize that I can create something like this:
import pymc as pm
tau = pm.DiscreteUniform("tau", lower=0, upper=22)
and then do something like this:
tau.random()
But I can't quite get how to construct the constraints and get it to output a list of numbers.
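I'm not sure PyMC3 is the right tool here, unless you want to infer something about some hyperparameters. Supposing you store the count data you posted in a 34 x 89 numpy array (one row per list), the following suffices:
from scipy.stats import multinomial
import numpy as np
data = np.array([[0, 0, 0, 4, ...], ..., ])
# probability of each of the 89 positions: column totals divided by the grand total
p = data.sum(axis=0) / data.sum()
row_count = 22
number_of_samples = 10
# each sample is a list of 89 counts that sums to row_count
samples = multinomial(row_count, p=p).rvs(number_of_samples)
A multinomial draw satisfies both constraints automatically: every sample sums to row_count, and any position that is zero in all of the observed lists gets probability zero, so it stays zero in every sample.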
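To see why a multinomial draw respects the constraints from the question, here is a minimal standard-library sketch of the same idea. It is not the scipy implementation. The small `data` table and the helper names `column_probabilities` and `sample_counts` are hypothetical stand-ins for illustration, and the approach assumes that per-position totals are a reasonable estimate of the position probabilities.

```python
import random
from collections import Counter

# Toy stand-in for the real count data: each row is one observed list.
# (Hypothetical values; the real lists in the question are much longer.)
data = [
    [0, 0, 3, 1, 0, 2],
    [0, 0, 4, 0, 0, 1],
    [0, 0, 2, 2, 0, 3],
]

def column_probabilities(rows):
    """Share of the grand total that falls in each position (column)."""
    column_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(column_totals)
    return [t / grand_total for t in column_totals]

def sample_counts(total, probs, rng):
    """One multinomial draw: drop `total` items into positions by weight."""
    positions = rng.choices(range(len(probs)), weights=probs, k=total)
    hits = Counter(positions)
    return [hits.get(i, 0) for i in range(len(probs))]

rng = random.Random(0)  # seeded only so the run is repeatable
p = column_probabilities(data)
sample = sample_counts(22, p, rng)
print(sample)
```

Each generated list sums to the requested total because exactly that many items are placed, and positions 0, 1 and 4 stay at zero because their weight is zero in the toy data. This only captures per-position frequencies; if neighbouring positions are correlated in the real data, a richer model would be needed.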