How to fix failures like CloneUserRepository and DestroyContainer in Airflow?

It's my first week of using Airflow. I am trying to schedule Snowflake SQL scripts that are sitting in GitHub. The code basically refreshes a table with new records (it runs fine locally in Snowflake). But the Airflow DAG is failing on these 2 tasks: CloneUserRepository and DestroyContainer. I have checked the Airflow documentation here but couldn't find anything specific on them.
Below is the graph view of the DAG I am trying to run. When I look at the log details for the failed CloneUserRepository task, it looks like the DAG is not able to access the GitHub repository because I have kept it private. My understanding was that since I am using an organizational GitHub and Airflow, my credentials should act as a bridge and let Airflow access GitHub comfortably. But I am not an expert, hence I am looking for help here. I certainly don't want to make the repository public, even to test my hypothesis.
Any expert comment/suggestion is much appreciated. Thank you.

I just checked with an Airflow SME at my company. That was indeed the problem, and it worked fine after making the repository public (within the enterprise).
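For readers who cannot make the repository public: a common alternative is to store a GitHub personal access token in an Airflow connection and clone over HTTPS from inside the task. A minimal sketch, assuming Airflow 2.x; the connection id "github_pat", the repo URL, and the target directory are all hypothetical, not from the original post:

import subprocess
from datetime import datetime

from airflow import DAG
from airflow.hooks.base import BaseHook
from airflow.operators.python import PythonOperator


def clone_user_repository():
    # Pull a GitHub personal access token from the password field of a
    # hypothetical Airflow connection named "github_pat".
    token = BaseHook.get_connection("github_pat").password
    # Token-authenticated HTTPS clone; org/repo and target dir are placeholders.
    subprocess.run(
        ["git", "clone",
         f"https://{token}@github.com/my-org/my-repo.git",
         "/tmp/my-repo"],
        check=True,
    )


with DAG(
    dag_id="clone_private_repo_example",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    PythonOperator(
        task_id="clone_user_repository",
        python_callable=clone_user_repository,
    )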

Related

airflow (2.1.1) LDAP instructions were wrong

I was following the instructions at https://airflow.apache.org/docs/apache-airflow/1.10.1/security.html?highlight=ldap to set up LDAP with the latest Airflow image (2.1.1) using Docker Compose.
This section had no impact on login at all (I could still only log in with the default airflow/airflow):
[webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.ldap_auth
Instead, I found a YouTube walkthrough, with instructions at https://www.notion.so/Airflow-with-LDAP-in-10-mins-cbcbe5690d3648f48ee7e8ca45cb755f#e1239b1bda91489b87e4e1bc12f733a7, that worked well. Can someone help explain why the Airflow LDAP instructions did not work?
One reason is that you used documentation for Airflow 1.10 with Airflow 2: you simply followed the wrong documentation for a different version. In Airflow 2, the [webserver] authenticate and auth_backend options no longer apply; webserver authentication is configured through Flask AppBuilder in webserver_config.py, as sketched below.
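A minimal sketch of the Airflow 2 approach via webserver_config.py; the server, base DN, and bind values below are placeholders for illustration, not taken from the question:

# webserver_config.py -- Airflow 2.x configures webserver auth through
# Flask AppBuilder, not through [webserver] keys in airflow.cfg.
from flask_appbuilder.security.manager import AUTH_LDAP

AUTH_TYPE = AUTH_LDAP
AUTH_LDAP_SERVER = "ldap://ldap.example.com:389"    # placeholder host
AUTH_LDAP_SEARCH = "ou=users,dc=example,dc=com"     # placeholder base DN
AUTH_LDAP_UID_FIELD = "uid"
AUTH_LDAP_BIND_USER = "cn=admin,dc=example,dc=com"  # placeholder bind user
AUTH_LDAP_BIND_PASSWORD = "change-me"
# Auto-create users on first successful LDAP login with a default role.
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Viewer"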
But there is also an upside: this gives you the opportunity to improve the documentation. If you know instructions that work, you could submit them as a PR to Airflow. It is very easy: go to the page of the Airflow documentation you want to improve, click "suggest improvement on this Page", and you can submit your doc improvement immediately. Airflow has more than 1700 contributors like you; it is community-managed, free software, so submitting a doc improvement seems like the least you can do to pay back for the free software you use.
In short - feel free to improve Airflow documentation at any point in time.

Cloud Run suddenly got `Improper path /cloudsql/{SQL_CONNECTION_NAME} to connect to Postgres Cloud SQL instance "{SQL_CONNECTION_NAME}"`

We have been running a service using NestJS and TypeORM on fully managed Cloud Run without issues for several months. Yesterday afternoon we started getting `Improper path /cloudsql/{SQL_CONNECTION_NAME} to connect to Postgres Cloud SQL instance "{SQL_CONNECTION_NAME}"` errors in our logs.
We didn't make any server or SQL changes around that time. Currently there is no impact on the service, so we are not sure if this is a serious issue.
This error is not coming from our code, and our third-party modules shouldn't know whether we use Cloud SQL, so I have no idea where these errors come from.
My assumption is that the Cloud SQL proxy, or some SQL client used in Cloud Run, is producing this error. We use the --add-cloudsql-instances flag when deploying with the "gcloud run deploy" CLI command.
Link to the issue here
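For context, the /cloudsql/{SQL_CONNECTION_NAME} path in the message is the unix-socket directory that Cloud Run mounts when a service is deployed with --add-cloudsql-instances. A hedged Python illustration of how a client connects through it (the poster's service is NestJS/TypeORM, so this is not their code; psycopg2 and the environment variable names are assumptions):

import os

import psycopg2  # any Postgres client that accepts a unix-socket directory works

# SQL_CONNECTION_NAME has the form "project:region:instance"; Cloud Run
# mounts the matching socket under /cloudsql when the service is deployed
# with --add-cloudsql-instances.
conn = psycopg2.connect(
    host=f"/cloudsql/{os.environ['SQL_CONNECTION_NAME']}",
    dbname=os.environ.get("DB_NAME", "postgres"),
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
)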
This log was recently added in the Cloud Run data path to provide more context for debugging CloudSQL connectivity issues. However, the original logic was overly aggressive, emitting this message even for properly working CloudSQL connections. Your application is working correctly and should not receive this warning.
Thank you for reporting this issue. The fix is ready and should roll out soon. You should not see this message anymore after the fix is out.

Accessing the StreamSets web UI on a different node in a cluster than where it is installed: which file system does it 'look in'?

I have a cluster of machines hosting Hadoop (MapR) and have installed StreamSets on one of the nodes (say node002) following the RPM documentation. However, I am accessing the web UI for the Data Collector from another node, node001.
My question is: when I specify file paths (e.g. an origin directory), which file system will the web UI be referring to? E.g. if I set the origin directory to /home/myuser/mydata, will the pipeline created in the web UI look for that directory on node001 or node002? I'm new to StreamSets, so a more detailed answer would be appreciated. Thanks.
** Ultimately I am asking this because I am currently getting "FileNotFound" and "permission denied" errors while trying to follow the documentation's tutorial, and I am trying to debug the situation.
From the StreamSets community forums: it will be the path to the local file on the machine running that particular SDC instance.
The FileNotFound and permission errors have to do with the fact that the default user for the sdc service is a user called sdc. I am still working on how to fix this properly, but a workable prototype can be produced by setting the read and write access on the directories in question to allow public access, as sketched below (this answers the posted question).
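A rough sketch of that workaround in Python, using the origin directory from the question; world-readable/writable permissions are for prototyping only, and chown-ing the directory to the sdc user would be the cleaner fix:

import os
import stat

# Origin directory from the question; the SDC service runs as user "sdc",
# which cannot read /home/myuser by default.
ORIGIN_DIR = "/home/myuser/mydata"

RWX_ALL = stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO  # rwx for user/group/other

os.chmod(ORIGIN_DIR, RWX_ALL)
for root, dirs, files in os.walk(ORIGIN_DIR):
    for name in dirs + files:
        os.chmod(os.path.join(root, name), RWX_ALL)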

chef mongodb user_management (create admin and other users)

I'm relatively new to Chef and am in the process of using the edelight mongodb cookbook. I've got the process of creating a standalone MongoDB instance working fine. What I'm stuck on is understanding how to use the subsequent user_management recipe to create the initial admin user and regular users.
When I add "default['mongodb']['config']['auth'] = true" to the attributes/default.rb file and run the mongodb::default recipe, the db is created and authentication is on.
However, when I run the mongodb::user_management recipe I get the error below every time. Clearly I'm doing something wrong, but being new to editing Chef/Ruby files I can't determine what's failing. It looks like I might need to work within the users.rb attributes file?
===================================================
Error executing action add on resource 'mongodb_user[admin]'
NameError
uninitialized constant Mongo::MongoClient
The edelight cookbook has been unmaintained for quite some time now. (The NameError itself typically means the cookbook was written against the 1.x Ruby mongo driver, which provided Mongo::MongoClient, while a 2.x version of the mongo gem, which renamed it to Mongo::Client, is installed.) Chef-Brigade is attempting to take over maintenance of the cookbook until a new owner can be found.
https://github.com/chef-brigade/mongodb-cookbook
There is work underway to fix some of the user_management issues. I am not 100% sure of the current state of the user_management fixes, but you would likely be better off starting with that cookbook and reporting any issues to the team there so they can work to resolve them. There is active development taking place.
I would be glad to help you debug the issue if it persists on the chef-brigade flavor of the cookbook as we can actively make changes to resolve any issues.

MongoDB replica set in Azure "Waiting for role to start... Calling OnRoleStart()"

I have a problem trying to implement a MongoDB replica set as a worker role instance in Windows Azure. In the Windows Azure portal, one of the instances is shown as busy with the status:
Waiting for role to start... Calling OnRoleStart()
I have checked all the settings and everything seems to be OK. What could the problem be?
Denis Markelov's blog post helped me solve this problem. The solution is mainly his; however, I had to take an extra step to get it to work and thought others might find it useful.
Solution from blog:
Windows Azure reuses virtual machines for roles, so after a fresh deployment you can find files on the hard drive that were created during previous sessions. If MongoDB was terminated improperly, there might be a lock file (a "persisted mutex" analogue) because of which MongoDB refuses to start. It is located on the drive with the label "WindowsAzureDrive" (say it is F:), at the path:
F:\data\mongod.lock
In the case of production use this situation might require recovery procedures, but if you are just in the process of initial setup, it is safe to remove this file, letting MongoDB start again.
I was having this problem and did as suggested; however, I was still getting the same error. So I took a look at the log file at
C:\Resources\Directory\.MongoDB.WindowsAzure.MongoDBRole.MongodLogDir\mongod.txt
and saw that another file was also producing an error. To fix the problem, you also have to delete the file local.ns in the same directory as mongod.lock.
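Putting both steps together, a small hedged sketch of the cleanup (drive letter and data directory as described in the quote above; again, only safe during initial setup, not for production data):

import os

# Stale files left behind when MongoDB was terminated improperly; paths
# follow the F:\data location described in the blog quote above.
for stale in (r"F:\data\mongod.lock", r"F:\data\local.ns"):
    if os.path.exists(stale):
        os.remove(stale)
        print(f"removed {stale}")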