Inverted exclamation mark gets added to the output written to Google Cloud Storage - StreamSets

Thanks in advance for your support.
Using a StreamSets pipeline, I am trying to read MSSQL CDC data with the SQL Server CDC Client origin and load it into two destinations: Google Cloud Storage and Local FS.
While the Local FS destination writes as expected, the file in Google Cloud Storage has an inverted exclamation mark at the beginning.
I could not figure out why this happens. Any ideas?
ยก{"header":{"stageCreator":"SQLServerCDCClient_01","sourceId":"cdc.HumanResources_Shift_CT::___$seqval=0000005500002F380008::___$operation=1::___$start_lsn=0000005500002F38000E","stagesPath":"SQLServerCDCClient_01","trackingId":"cdc.HumanResources_Shift_CT::___$seqval=0000005500002F380008::___$operation=1::___$start_lsn=0000005500002F38000E::SQLServerCDCClient_01","previousTrackingId":null,"raw":null,"rawMimeType":null,"errorDataCollectorId":null,"errorPipelineName":null,"errorStage":null,"errorStageLabel":null,"errorCode":null,"errorMessage":null,"errorTimestamp":0,"errorStackTrace":null,"errorJobId":null,"values":{"sdc.operation.type":"2","jdbc.__$seqval.jdbcType":"-2","jdbc.__$start_lsn":"0000005500002F38000E","jdbc.__$operation.jdbcType":"4","jdbc.__$update_mask.jdbcType":"-3","jdbc.cdc.source_name":"Shift","jdbc.tables":"HumanResources_Shift_CT","jdbc.__$seqval":"0000005500002F380008","jdbc.__$update_mask":"1F","jdbc.ShiftID.jdbcType":"-6","jdbc.__$operation":"1","jdbc.StartTime.jdbcType":"92","jdbc.ModifiedDate.jdbcType":"93","jdbc.cdc.source_schema_name":"HumanResources","jdbc.Name.jdbcType":"-9","jdbc.EndTime.jdbcType":"92","jdbc.__$start_lsn.jdbcType":"-2"}},"value":{"type":"LIST_MAP","value":[{"type":"SHORT","value":15,"sqpath":"/ShiftID","dqpath":"/ShiftID"},{"type":"STRING","value":"Full Day Shift","sqpath":"/Name","dqpath":"/Name"},{"type":"TIME","value":38836977,"sqpath":"/StartTime","dqpath":"/StartTime"},{"type":"TIME","value":67636977,"sqpath":"/EndTime","dqpath":"/EndTime"},{"type":"DATETIME","value":1625132836977,"sqpath":"/ModifiedDate","dqpath":"/ModifiedDate"}],"sqpath":"","dqpath":""}}
Configuration in the Google Cloud Storage stage:
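For what it's worth, a quick way to see exactly what the extra character is would be to dump the first bytes of the written object. Here is a minimal diagnostic sketch, assuming the Python google-cloud-storage client (bucket and object names are placeholders):

    from google.cloud import storage

    client = storage.Client()
    blob = client.bucket("my-bucket").blob("path/to/output-file")
    # Fetch only the first 16 bytes of the object
    head = blob.download_as_bytes(start=0, end=15)
    print(head)  # '¡' is U+00A1, i.e. b'\xc2\xa1' in UTF-8

This at least shows whether the leading character is a single marker byte or a UTF-8 encoded '¡' added before the JSON.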

Related

Firestore: Copy/Import data from local emulator to Cloud DB

I need to copy Firestore DB data from my local Firebase emulator to the cloud instance. I can move data from the Cloud DB to the local DB fine, using the EXPORT functionality in the Firebase admin console. We have been working on the local database instance for 3-4 months and now we need to move it back to the Cloud. I have tried to move the local "--export-on-exit" files back to my storage bucket and then IMPORT from there to the Cloud DB, but it fails every time.
I have seen one comment by Doug at https://stackoverflow.com/a/71819566/20390759 that this is not possible, and that the best solution is to write a program to copy from local to cloud. I've started working on that, but I can't find a way to have both databases, local and cloud, open at the same time. They all use the same project ID, app key, etc.
Attempted IMPORT: I copied the files created by "--export-on-exit" in the emulator to my cloud storage bucket. Then I selected IMPORT and chose the file I had copied up to the bucket. I get this error message:
Google Cloud Storage file does not exist: /xxxxxx.appspot.com/2022-12-05 from local/2022-12-05 from local.overall_export_metadata
So, I renamed the metadata file to match the directory name from the local system; the IMPORT then claims to start successfully, but it fails with no error message.
I've been using Firestore for several years, but we just started using the emulator this year. Overall I like it, but if I can't easily move data back to the Cloud, I can't use it. Thanks for any ideas/help.
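For the "write a program" route, here is a rough sketch of the two-clients idea, assuming the Python google-cloud-firestore package (project ID and collection name are placeholders). The client reads FIRESTORE_EMULATOR_HOST from the environment when it is constructed, so one client can be pointed at the emulator and a second at the cloud by toggling that variable between constructions; subcollections would need recursive handling:

    import os
    from google.cloud import firestore

    # Construct the first client while the emulator variable is set
    os.environ["FIRESTORE_EMULATOR_HOST"] = "localhost:8080"
    local_db = firestore.Client(project="my-project-id")

    # Unset it before constructing the second client, which talks to the cloud DB
    del os.environ["FIRESTORE_EMULATOR_HOST"]
    cloud_db = firestore.Client(project="my-project-id")

    # Copy one top-level collection, document by document
    for snap in local_db.collection("users").stream():
        cloud_db.collection("users").document(snap.id).set(snap.to_dict())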

How to do local and remote file storage in a Flutter app

I'm building a Flutter app that needs to open a binary file, display the content to a user, and allow them to edit and save it. File size would be between 10 KB and 10 MB. The file also needs to be in the cloud for sharing and access from other devices. To minimise remote network egress charges and the user's mobile data charges, I'm envisaging that when the user saves the file it is saved locally rather than written to the cloud, and only written to the cloud when the user closes the file, at regular intervals, or after a period of no activity. To keep network data charges down I would like to keep a permanent local copy, and the remote copy of the file would have a small supporting file that identifies who last wrote the remote file and when. When the app starts, it checks whether its local copy is up to date by reading that supporting file; a sketch of that check follows below. The data does not need high security.
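As an illustration of that startup check, sketched in Python with the google-cloud-storage client purely to show the protocol (a Flutter app would do the same in Dart; the supporting-file name and metadata fields are made up):

    import json
    from google.cloud import storage

    def local_copy_is_current(bucket, doc_name, local_meta):
        # The small supporting file sits next to the document in the bucket,
        # e.g. {"writer": "alice", "written_at": "2023-01-05T10:00:00Z"}
        meta_blob = bucket.blob(doc_name + ".meta.json")
        remote_meta = json.loads(meta_blob.download_as_bytes())
        # Only download the (possibly multi-MB) document if the stamps differ
        return remote_meta["written_at"] == local_meta.get("written_at")

Only the few-bytes supporting file crosses the network on every start; the 10 KB to 10 MB document is downloaded only when the stamps differ.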
The app will run on Android, iOS, and the web, and preferably on the desktop, though I know the Google Firebase SDK for Windows is incomplete/unavailable.
Is Google Firebase Cloud Storage the easiest and best way to do this? If not, what is the easiest way?
Are there any cloud storage providers that don't charge for network egress data, just for storage?

How can you integrate Grafana with Google Cloud SQL?

I haven't been able to find out how to take a Postgres instance on Google Cloud SQL (on GCP) and hook it up to a Grafana dashboard to visualize the data in the DB. Is there an accepted, easy way to do this? I'm a complete newbie to Grafana and have limited experience with GCP (I have used the Cloud SQL proxy to connect to a Postgres instance).
Grafana displays the data. Google Cloud Monitoring stores the data to display. So, you have to make a link between the two.
And boom, magically, a plug-in exists!
Note: when you know what you're searching for, it's easier to find. Understand your architecture to reach the next level!
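That plug-in covers Cloud Monitoring metrics; to chart the rows inside the Postgres instance itself, Grafana also has a built-in PostgreSQL data source, which can be pointed at the Cloud SQL proxy you already used. A minimal sketch to confirm the proxy connection works before configuring the data source, assuming Python and psycopg2 (credentials and database name are placeholders):

    import psycopg2

    # Proxy started locally first, e.g.:
    #   ./cloud_sql_proxy -instances=my-project:my-region:my-instance=tcp:5432
    conn = psycopg2.connect(host="127.0.0.1", port=5432,
                            dbname="mydb", user="grafana", password="...")
    with conn.cursor() as cur:
        cur.execute("SELECT now()")
        print(cur.fetchone())

Grafana's PostgreSQL data source then uses the same host and port (127.0.0.1:5432).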

The performance of Azure PostgreSQL is bad

I am using Azure PostgreSQL. In my web project I save files (such as images and .csv files) in a table as the bytea data type.
I can retrieve files from Azure PostgreSQL successfully, but the performance is poor: it takes more than 20s to retrieve files when I request multiple files, and sometimes I even get a timeout error from my web server. Can anyone give me some advice on how to solve this? Any suggestion is appreciated.
As @eshirvana said, saving files in a database is not a good idea. As we know, querying blobs or files in SQL performs badly and also costs the DB server a lot of memory. Saving only URLs in the DB is recommended.
If you are using Azure Cloud, you can save your files in Azure Blob Storage and save each file's blob URL in your DB table.
Your client can then get the file URL from the web server via a SQL query and fetch the file from Azure Storage by that URL. This eases the IO and memory pressure on your web server and greatly improves query performance.
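A minimal sketch of that pattern, assuming the azure-storage-blob and psycopg2 Python packages (connection strings, container, and table names are placeholders):

    import psycopg2
    from azure.storage.blob import BlobServiceClient

    # Upload the file to Blob Storage instead of a bytea column
    blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")
    blob = blob_service.get_blob_client(container="uploads", blob="report.csv")
    with open("report.csv", "rb") as f:
        blob.upload_blob(f, overwrite=True)

    # Store only the URL; the file bytes never touch PostgreSQL
    conn = psycopg2.connect("<postgres-connection-string>")
    with conn, conn.cursor() as cur:
        cur.execute("INSERT INTO files (name, url) VALUES (%s, %s)",
                    ("report.csv", blob.url))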
What's more, creating your SQL service in a region that you are in or near can also reduce network latency. You can see all the Azure datacenter regions here.

Use Cygnus to store historical data from Orion ContextBroker in a local Hadoop database

We are currently working on a project where we use Orion ContextBroker to store information from different sensors and Wirecloud to show it on a web page.
We want to store historical data from these sensors in order to show it in a graph. I have looked around the FIWARE documentation, and it recommends storing the data in a Cosmos instance on FI-LAB, through Cygnus.
The thing is, we would like to store that historical data on a local Hadoop-based server we have in our company, not in Cosmos, because we are running this project on a local network with no internet access, and we also want that information stored on our own server.
Is it possible to configure Cygnus to redirect the output data to my file system? If so, which files must be configured to achieve this?
Thank you
The answer is yes. Cygnus is meant to persist context data in any HDFS-based filesystem (such as the one used by Cosmos), so nothing special has to be done when configuring Cygnus.
If you download the latest version (0.7.0 at the time of writing), you will need to configure:
A cygnus_instance_default.conf file, from cygnus_instance.conf.template. This is the instance configuration. Since 0.7.1 it is possible to have multiple instance configurations that run in parallel, and they all have to be called cygnus_instance_<whatever>.conf.
An agent.conf file, from agent.conf.template. This is the Flume-specific configuration that you will find described in the README.md.
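For orientation only, the HDFS sink section of agent.conf looks roughly like the snippet below. The property names are taken from the 0.7.x agent.conf.template and should be checked against the template shipped with your version; the host and user values are placeholders for your local Hadoop cluster:

    cygnusagent.sources = http-source
    cygnusagent.sinks = hdfs-sink
    cygnusagent.channels = hdfs-channel
    # Point the HDFS sink at the local namenode instead of Cosmos
    cygnusagent.sinks.hdfs-sink.type = com.telefonica.iot.cygnus.sinks.OrionHDFSSink
    cygnusagent.sinks.hdfs-sink.channel = hdfs-channel
    cygnusagent.sinks.hdfs-sink.cosmos_host = namenode.local
    cygnusagent.sinks.hdfs-sink.cosmos_port = 14000
    cygnusagent.sinks.hdfs-sink.cosmos_default_username = cygnus
    cygnusagent.sinks.hdfs-sink.hdfs_api = httpfs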