PutMongoRecord cannot access filename - mongodb

I have a simple flow setup in Nifi:
GetFile picks up CSV files from a directory
PutMongoRecord stores them in a MongoDB collection (using a CSVReader)
I want to put the records into a collection whose name is derived from the filename: ${filename:substringBefore('.csv')}. My problem is that I can't seem to get the PutMongoRecord processor to read the filename. Every time, I get the same error:
com.mongodb.MongoCommandException: Command failed with error 73: 'Invalid namespace specified 'xxx.'' on server localhost:27017. The full response is { "ok" : 0.0, "errmsg" : "Invalid namespace specified 'xxx.'", "code" : 73, "codeName" : "InvalidNamespace" }
If I try hard-coding a collection name, it works. It also works with ${hostname()}. Since the processor is connected to the "success" output of GetFile, why isn't it reading the filename?
NOTE: I have tested this with a LogAttribute processor: a filename attribute is indeed present. I have tried various other attributes as well, but none of them seem to produce anything.

This is a bug in NiFi up to and including 1.6.0, and it was recently fixed; take a look at NIFI-5197. The fix will be released in NiFi 1.7.0, which I believe will be available in a couple of weeks.
If it is an urgent need, write to dev@nifi.apache.org and it may be possible to get the patch for this.


Print editions using metaboss on Solana

I'm trying to create prints from a master edition (aka original edition) using metaboss from the console. The number of prints should be limited to a fixed number.
I followed this procedure:
1. Upload the image to Arweave: arloader upload image.jpg --with-sol --sol-keypair-path ~/.config/solana/id.json --ar-default-keypair --no-bundle
2. Create the JSON file with the NFT metadata:
{
  "name": "name_of__the_collection",
  "symbol": "token_of_the_collection",
  "uri": "https://arweave.net/[arweave_img_tx_id]",
  "seller_fee_basis_points": 0,
  "creators": [
    {
      "address": "address_of_the_creator_of_the_collection",
      "verified": false,
      "share": 100
    }
  ]
}
3. Mint the NFT:
metaboss mint one --keypair ~/.config/solana/id.json --nft-data-file ./metadata.json --max-editions='10'
4. Create all the prints:
metaboss mint missing-editions --account address_of_the_creator_of_the_collection
I have two issues:
1. On Solana Explorer, I get an error: error loading image
2. The command in step 4 returns an error: Error: failed to get account data
What's wrong?
[edit] Error 1: I used the uri key instead of image in the metadata. That's why Solana Explorer couldn't find the image.
Generally the process is good. There are some details that have to be aligned though:
Regarding the missing image:
You have to upload the metadata JSON file, too. This is what you reference in the mint command.
Your metadata is not 100% valid, e.g. you are missing the properties field. Have a look at the Token Metadata docs for more details.
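For illustration only, here is a minimal sketch of what the uploaded (off-chain) metadata file could look like once the image and properties fields are added; all values are placeholders, and the exact schema should be double-checked against the Token Metadata docs:
# Hypothetical off-chain metadata; upload this file to Arweave and reference
# its resulting URL in the "uri" field of the --nft-data-file used for minting.
cat > offchain_metadata.json <<'EOF'
{
  "name": "name_of_the_collection",
  "symbol": "token_of_the_collection",
  "description": "placeholder description",
  "image": "https://arweave.net/[arweave_img_tx_id]",
  "seller_fee_basis_points": 0,
  "properties": {
    "files": [
      { "uri": "https://arweave.net/[arweave_img_tx_id]", "type": "image/jpeg" }
    ],
    "category": "image"
  }
}
EOF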
Regarding metaboss mint missing-editions:
The account you specify with --account should not be the address of the creator of the collection but the Master Edition address (the Master Edition is the NFT you minted in step 3).
Since the command runs a GPA call, you should add --timeout 120 and not use the default RPC endpoint; otherwise you will not get results. A sketch of the full call follows below.
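Putting those two points together, the corrected call could look roughly like this (this assumes metaboss's global --rpc and --timeout options; the endpoint and mint address are placeholders):
# Hypothetical values: substitute your own RPC endpoint and the mint address of
# the Master Edition NFT created in step 3.
metaboss --rpc https://your-own-rpc-endpoint.example --timeout 120 \
  mint missing-editions --account <master_edition_mint_address>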
If it still does not work you can also run
metaboss mint editions --next-editions 9
Please let me know in case of any uncertainties.

How to resolve "Invalid Sequence Token" when using cloudwatch agent?

I'm seeing the following warning in the /var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log:
2021-10-06T06:39:23Z W! [outputs.cloudwatchlogs] Invalid SequenceToken used, will use new token and retry: The given sequenceToken is invalid. The next expected sequenceToken is: 49619410836690261519535138406911035003981074860446093650
But there is no mention of which file is actually the one failing, not even when I add "debug": true to /opt/aws/amazon-cloudwatch-agent/bin/config.json.
cat /opt/aws/amazon-cloudwatch-agent/bin/config.json|jq .agent
{
  "metrics_collection_interval": 60,
  "debug": true,
  "run_as_user": "root"
}
I have many (28) files in the .logs.logs_collected.files.collect_list section of my config.json file, so how can I find which file exactly is causing the trouble?
As of 2021-11-29 a PR to improve the log messages has been merged into the cloudwatch-agent, but a new version of the cloudwatch-agent has not been released yet; the next version after v1.247349.0 will likely include this fix.
The fix will change the log statements to say:
INFO: First time sending logs to %v/%v since startup so sequenceToken is nil, learned new token: xxxx: yyyy (this is an INFO message, as this behaviour is expected, for example at startup).
WARN: Invalid SequenceToken used (%v) while sending logs to %v/%v, will use new token and retry: xxxxxv (this, on the other hand, is not expected and may mean that something else is writing to the log group/log stream concurrently).
If those warnings come right after a restart of the cloudwatch agent (cwagent), you can safely ignore them; it's expected behaviour. The cloudwatch agent does not save the next sequence token in its persistent state, so on restart it "learns" the correct sequence number by issuing a PutLogEvents call with no sequence token at all; that call returns an InvalidSequenceTokenException containing the next sequence token to use. So it's expected to see those at startup. In any case, I proposed a PR to amazon-cloudwatch-agent to improve those log messages.
If the "Invalid SequenceToken used" is seen long after the restart then you may have other issues.
The "Invalid SequenceToken used" error usually means that two entities/sources are trying to write to the same log group/log stream as mentioned in 2 (which is really for the old awslogs agent but still useful):
Caught exception: An error occurred (InvalidSequenceTokenException) when calling the PutLogEvents operation: The given sequenceToken is invalid[…] -or- Multiple agents might be sending log events to log stream[…] – You can't push logs from multiple log files to a single log stream. Update your configuration to push each log to a log stream-log group combination.
It could be that the Amazon CloudWatch agent itself is trying to upload the same file twice because you have duplicates in your config.json.
So first print all your log group / log stream pairs in your config.json with:
cat /opt/aws/amazon-cloudwatch-agent/bin/config.json|jq -r '.logs.logs_collected.files.collect_list[]|"\(.log_group_name) \(.log_stream_name)"'|sort
which should give an output similar to:
/tableauserver/apigateway apigateway_node5-0.log
/tableauserver/apigateway control_apigateway_node5-0.log
/tableauserver/appzookeeper appzookeeper-discovery_node5-1.log
...
/tableauserver/vizqlserver vizqlserver_node5-3.log
Then you can use uniq -d to find the duplicates in that list with:
cat /opt/aws/amazon-cloudwatch-agent/bin/config.json|jq -r '.logs.logs_collected.files.collect_list[]|"\(.log_group_name) \(.log_stream_name)"'|sort|uniq -d
# The list should be empty otherwise you have duplicates
If that command produces any output it means that you have duplicates in your config.json collect_list.
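As an illustration of what a clean configuration looks like (the file paths and names below are hypothetical), every entry in collect_list should map to its own log group/log stream pair:
# Show the first two collect_list entries; note the distinct log_stream_name values.
cat /opt/aws/amazon-cloudwatch-agent/bin/config.json | jq '.logs.logs_collected.files.collect_list[:2]'
# [
#   {
#     "file_path": "/var/log/app/app.log",
#     "log_group_name": "/myapp/app",
#     "log_stream_name": "app_{instance_id}"
#   },
#   {
#     "file_path": "/var/log/app/worker.log",
#     "log_group_name": "/myapp/app",
#     "log_stream_name": "worker_{instance_id}"
#   }
# ]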
I personally think that cwagent itself should print the "offending" log group/log stream in the logs, so I opened an issue on the amazon-cloudwatch-agent GitHub page.

Crowdin error when I tried to upload translations

I have an issue (I'm still blocked). I've created my configuration file like this:
project_identifier: test
api_key: KeepTheAPIkeySecret
base_url: https://api.crowdin.com
base_path: /path/to/your/project
files:
  -
    source: /locale/en/LC_MESSAGES/messages.po
    translation: /locale/%two_letters_code%/LC_MESSAGES/%original_file_name%
See: https://github.com/crowdin/crowdin-cli
However, I receive an error message when I execute the command line to upload translations to Crowdin:
error: Seems Crowdin server API URL is not valid.
Please check the `base_url` parameter in the configuration file.
I don't know why it's not working! Thanks for any help!
Crowdin sent me another JAR; the previous one did not handle Windows paths correctly.
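For reference, with the old API-v1-era Java CLI the upload is typically run like this from the base_path directory (the JAR file name may differ depending on the build Crowdin sent you):
# Assumes crowdin.yaml (the configuration above) sits in the current directory.
java -jar crowdin-cli.jar upload sources
java -jar crowdin-cli.jar upload translations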

Configure MongoDB with Nutch2.3, some error about indexerJob?

I have successfully configured MongoDB (5.3.1) with Nutch (2.3). When I run the command "./bin/nutch index -all", some errors are printed even though the inject/generate/fetch/parse/updatedb commands work. The error details are:
SolrIndexerJob: java.lang.RuntimeException: job failed: name=apache-nutch-2.3.1.jar, jobid=job_local140530148_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:154)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:176)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:202)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:211)
I have also configured the file $NUTCH_HOME/runtime/local/conf/nutch-site.xml.
If all the other steps ran successfully, then it is probably not a problem with MongoDB but with Solr (your nutch-site.xml suggests that you want to index your data in Solr). As far as I remember, when I used Solr I had to specify the core name, so the URL would be something like this:
http://localhost:8983/solr/mycore/
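A quick way to sanity-check this (assuming a hypothetical core named mycore) is to ping the core and make sure the solr.server.url property in nutch-site.xml points at the same URL:
# Ping the Solr core to confirm it exists and is reachable (replace "mycore").
curl "http://localhost:8983/solr/mycore/admin/ping?wt=json"
# Then set solr.server.url in $NUTCH_HOME/runtime/local/conf/nutch-site.xml to
# http://localhost:8983/solr/mycore/ and re-run the indexing job:
./bin/nutch index -all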

mongod crashed without logging

I'm using MongoDB v2.2.2 on a single server (Ubuntu 12.04).
It crashed without writing any error to /var/log/mongodb/mongodb.log.
It seems to have crashed while logging: the last line is cut off mid-character, and it is a normal query log line.
I also checked syslog for memory issues (for example, a killed process), but couldn't find anything.
Then I found the following error in the mongo shell (via the db.printCollectionStats() command).
DLLConnectionResultData
{
    "ns" : "UserData.DLLConnectionResultData",
    "count" : 8215398,
    "size" : 4831306500,
    "avgObjSize" : 588.0794211065611,
    "errmsg" : "exception: assertion src/mongo/db/database.cpp:300",
    "code" : 0,
    "ok" : 0
}
How can I figure out the problem?
Thank you,
I checked that line in the source code for 2.2.2 (see here for reference). That error is specifically related to enforcing quotas in MongoDB. You haven't mentioned whether you are enforcing quotas or what you have set the files limit to (the default is 8), but you could be running into that limit here.
First, I would recommend getting onto a more recent version of 2.2 (and upgrading to 2.4 eventually, but definitely 2.2.7+ initially). If you are using quotas, this fix, which went into 2.2.5, logs quota-exceeded messages at the default log level (previously they were logged only at log level 1; the default is log level 0). Hence, if a quota violation is the culprit here, you would get an early warning.
If that is the root cause, then you have a couple of options:
After upgrading to the latest version of 2.2, if the issue happens repeatedly, file a bug report for the crash on 2.2.
Upgrade to 2.4, verify that the issue still occurs, and file a bug (or add to the above report for 2.2).
In either case, turning off quotas in the interim would be the obvious way to prevent the crash; a sketch follows below.
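For reference, here is a minimal sketch of how one might check and then disable quotas on a 2.2-era deployment (the config path, option names, and service name depend on how mongod was installed and started):
# In the mongo shell, inspect how mongod was started and whether quotas are on:
#   db.serverCmdLineOpts()
# Look for "quota" / "quotaFiles" in the parsed options.
#
# If quotas are enabled via the (old ini-style) config file, e.g. /etc/mongodb.conf,
# disable them by commenting out or changing the relevant lines:
#   quota = false
#   # quotaFiles = 8
# then restart mongod so the change takes effect:
sudo service mongodb restart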