Streamsets code behind

I am interested in working with StreamSets. However, I would like to integrate it into my own code rather than work through the UI. How have the Directory and File Tail origins been written, and can I access the code behind them? Are they using Spark Streaming or other technologies behind the scenes?

Here are the steps for starting SDC in UI development mode.
When starting SDC, pass the flag -Dsdc.static-web.dir=/<SDC_SOURCE_CODE_DIR>/datacollector/datacollector-ui/target/dist, like this:
export SDC_JAVA_OPTS="-Dsdc.static-web.dir=/Users/madhu/Documents/projects/datacollector/datacollector-ui/target/dist"
bin/streamsets dc -verbose
Then go to the folder <SDC_SOURCE_CODE_DIR>/datacollector/datacollector-ui/ and run the command below for live reload:
grunt watch --force
With the above steps, you will be able to modify the UI source files directly in the folder <SDC_SOURCE_CODE_DIR>/datacollector/datacollector-ui/src, and the changes will be reflected in the browser just by refreshing it; there is no need to rebuild anything.

Related

Why is GlueStudio not seeing same script as legacy jobs view in AWS console?

When I update my script in S3 (Script path) and open Glue Studio, I do not see the updates to this script (PySpark). But if I open the legacy jobs view, it has the latest code. What is also odd: if I look at Job details in Glue Studio, I see the updated --additional-python-modules. Our code is deployed through Ansible Tower. One other thing: if you are in Glue Studio and you "Run" the script, it will run the latest code. So the issue is with the editor not having the latest code.
This had something to do with security. I completely logged off and back on, clearing all caches, and was then able to see the updates in Studio. Seems like a bug, but I am past this issue.

Is there a way to reset a UWP app using powershell?

I see that there are ways to get (Get-AppxPackage) or remove (Remove-AppxPackage) a UWP app from Windows 10 using PowerShell.
I am wondering if there is a way to reset a UWP app. I require it for automated testing; I would rather reset the UWP app than uninstall and reinstall it, as that would slow down the testing.
If you're looking to clear your package's ApplicationData then you want
// C#: clear all ApplicationData for a package family (await this inside an async method)
var appdata = Windows.Management.Core.ApplicationDataManager.CreateForPackageFamily(pkgFamilyName);
await appdata.ClearAsync();
See MSDN for ApplicationDataManager.CreateForPackageFamily() and .ClearAsync()
PowerShell had no 'await' affordance last time I looked (though it's been a while), so that route is not so viable. If you're looking to muck with application data, you may find APPDATA.EXE handy. You can probably guess why I wrote it... :P For instance:
APPDATA.EXE Clear foo.bar_1234567890abc
If you're looking to reset a package to its initially installed state, then no, there is no API other than uninstall + install:
Remove-AppxPackage foo.bar_1.2.3.4_x86__1234567890abc
Add-AppxPackage foobar.msix
Settings' Reset option for an installed package essentially does that, just slightly more efficiently. You're still going through the full deregister-and-uninstall of the package and then installing and registering it for the user, so it may not be instantaneous. But that's the only way to truly reset a package to its initial state. Windows has various forms of user data associated with a package (ApplicationData, IndexDB, more) as well as system state cribbed and wired up when a package is installed for a user (what the package is, where it lives, that the user has it, that its status is not tampered or otherwise unhealthy, more). The only way to truly 'reset' that to the initial state is a full remove then add.
If you just need to wipe appdata, then .ClearAsync() is the ticket.
I am not aware of any command that would do that for you (except for the UI available in the Settings app). However, you may write a PowerShell script that clears the application data files in the app's folder (this is not an official solution, but it seems to work based on my trials).
Go to
C:\Users\{your_username}\AppData\Local\Packages\
And find your app's folder there. There are several folders containing the application state.
The nuclear option is to just delete all the folders. The app will then crash once on startup, after which the system will automatically reset it and restore the folders.
The less invasive option, which I have now tried, seems to be to keep the folders and just delete their contents, except for the AC folder, which seems to be system-protected. When I tried this with my app, it launched successfully without crashing, and the system recreated the state files anew on its own.
The rest of the application files live in C:\Program Files\WindowsApps, but those are just application DLLs and content files and are read-only, so they should not affect the app state at all.
You may want to perform additional deletion if you use a shared publisher folder.
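A minimal PowerShell sketch of the less invasive option above (the package folder name is a placeholder, and this is unofficial, so test it on a throwaway app first):
# Placeholder package family folder; look yours up under AppData\Local\Packages
$pkgFolder = "$env:LOCALAPPDATA\Packages\foo.bar_1234567890abc"
# Delete the contents of each state folder, but keep the folders themselves and skip AC
Get-ChildItem $pkgFolder -Directory | Where-Object { $_.Name -ne 'AC' } | ForEach-Object {
    Remove-Item (Join-Path $_.FullName '*') -Recurse -Force -ErrorAction SilentlyContinue
}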
For Windows 10 build 2004+ there is now a PowerShell command, Reset-AppxPackage.
UPDATE
My mistake, it is available from build 20175 onward. So it should be available in 20H2.
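If your build has the cmdlet, usage should look roughly like the sketch below; the -Package parameter and the lookup by name are assumptions based on the other Appx cmdlets, so check Get-Help Reset-AppxPackage first.
# Hypothetical usage; "foo.bar" is a placeholder app name
$pkg = Get-AppxPackage -Name "foo.bar"
Reset-AppxPackage -Package $pkg.PackageFullName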

Visual Studio Online / Azure stopping and starting web applications using Powershell

I'm using Visual Studio Online's build tools to deploy web applications from a single solution. I've occasionally been running into file locking issues.
Error: Web Deploy cannot modify the file 'Microsoft.CodeAnalysis.CSharp.dll' on the destination because it is locked by an external process.
After some Googling, I believe the "fix" is to stop the web applications on Azure before deployment and start them back up afterwards. Sounds legit.
However, there does not seem to be a straightforward way to do this directly in VSO's build definitions. I've created an "Azure Powershell" build task, but it wants a PS1 file from the repository. It doesn't seem to let me just run Azure PowerShell commands (e.g. Stop-AzureWebsite) from there. My team has created a workaround where we have a "run.ps1" that just executes the command you pass as a parameter, but none of us are satisfied with that.
What are we missing? There has got to be an easier way to do this without having a PS1 script checked into source control.
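(For reference, a "run.ps1" like the one described is presumably just a thin wrapper along these lines; this is a hedged reconstruction, not the team's actual script.)
param([string]$Command)
# Run whatever Azure PowerShell command was passed in, e.g.:
#   .\run.ps1 -Command "Stop-AzureWebsite -Name mysite"
Invoke-Expression $Command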
I solved this by installing the Azure App Services - Start and Stop extension from the Visual Studio Marketplace.
When installed, it allows you to wrap the Deploy Website to Azure task in your release definition with Azure AppServices Stop and Azure AppServices Start tasks, effectively eliminating the lock issues.
Check whether you are using "/" as the folder separator in the "Web Deploy Package" path instead of "\".
i.e. change
$(System.DefaultWorkingDirectory)/My Project/drop/MyFolder/MyFile.zip
to
$(System.DefaultWorkingDirectory)\My Project\drop\MyFolder\MyFile.zip
I noticed that was the only difference between the deployment that was getting the error and the others (the Restart step I had added was not helping). Once I modified the path, it worked.
Sounds crappy, but fixed my issue.
Did you use the Build Deployment Template that sets the correct MSBuild parameters for your package? You can see how here. I would create a build using that template and see if you have the same issues. If so, ping me on Twitter @DonovanBrown and I will see if I can figure out what is going on.
As a rule, it is good practice to have any scripts or commands required to deploy your software checked into source control as part of your build. They can then be run repeatedly with little configuration at the build level. This provides consistency and transparency.
Even better is to have deployment scripts output as part of the build and use a Release Management tool to control the actual deployment.
Regardless, having configuration as code is a mantra that all Dev and Ops teams should live by.
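A checked-in deployment script of that shape might look like the following sketch, using the classic Stop-AzureWebsite/Start-AzureWebsite cmdlets the question mentions (the site name, parameters and deployment step are illustrative):
param([string]$SiteName = "mysite", [string]$PackagePath)
# Stop the site so Web Deploy can't hit locked files
Stop-AzureWebsite -Name $SiteName
# ...deploy $PackagePath here (msdeploy, Publish-AzureWebsiteProject, etc.)...
Start-AzureWebsite -Name $SiteName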

Custom Action not being fired

Recently, I was assigned the task of creating a deployment package for an application, which, by the way, I'm totally new at. So far, so good. Now there is a requirement to extract files from a zip file that will be bundled with the setup file. So I had to write custom actions in the 'Commit' section of the Installer class. I added the Installer class in a new project of type 'Class Library' under the same solution and wrote the code after 'base.Commit(savedState)'.
I tried showing a MessageBox at the event entry point and used Debugger.Launch() and Debugger.Break(), but somehow, no matter what I do, the custom action does not seem to be hit at all and the application just installs itself. I have searched a lot of sites and blogs, but no help so far.
I've assigned my installer class (SampleApp.exe, in my case) to all of the Custom Action modes (Install, Commit, Rollback and Uninstall) in the deployment project. Any help?
P.S. I'm using a Visual Studio 2010 setup project.
Thanks, in advance!
You should probably be using a class library DLL, not an executable (which is typically for something like a service).
You don't need it in all the nodes if all you're doing is calling it at Commit. And why Commit? Install is just the same in most cases.
If you're not seeing a MessageBox, then your CA probably isn't being called, and that may be because it's not a class library. Note that your CA is not running in the interactive user context - it's being called from an msiexec process running under the system account, so you must be very explicit about (say) the path to the zip file, and any user-profile folders will probably fail because the system account doesn't really have them.
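A minimal sketch of such a class-library custom action (the class name and path are illustrative; note the absolute path, since user-profile paths won't resolve under the system account):
using System.Collections;
using System.ComponentModel;
using System.Configuration.Install;

[RunInstaller(true)]
public class ZipExtractInstaller : Installer
{
    public override void Commit(IDictionary savedState)
    {
        base.Commit(savedState);
        // Runs from msiexec under the system account, so be explicit about paths.
        string zipPath = @"C:\Program Files\MyApp\payload.zip";  // illustrative
        // ...extract zipPath here with the zip library of your choice...
    }
}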
What files are these, and where are they going on disk? If they are user-profile files, you can install the zip file to a per-machine location and then have the application itself unzip the files to the desired location on first launch. Unzipping from within your setup is not good practice - it is error-prone and bad design.
Using the application allows proper exception handling and interactivity (the user can be informed if something goes wrong). Set some registry flags in HKCU when you have completed the unzipping so it doesn't happen more than once, and perform the unzip once per user.
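A sketch of that first-launch pattern (assumes .NET 4.5+ for ZipFile; the registry key and value name are illustrative):
using System.IO.Compression;
using Microsoft.Win32;

static void UnzipOncePerUser(string zipPath, string targetDir)
{
    using (var key = Registry.CurrentUser.CreateSubKey(@"Software\MyApp"))
    {
        // The HKCU flag makes sure the unzip happens only once per user.
        if ((int)key.GetValue("Unzipped", 0) == 1) return;
        ZipFile.ExtractToDirectory(zipPath, targetDir);
        key.SetValue("Unzipped", 1, RegistryValueKind.DWord);
    }
}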

How to avoid redundancy and time loss when re-building images during development?

As a Vagrant user, when trying Docker I noticed one significant difference between the development workflow with Vagrant and with Docker: with Docker I need to rebuild my image from scratch every time, even if I made only minor changes to the code.
This is a major problem for me, because the image-rebuilding process is often very redundant and time-consuming.
Perhaps some smart workflows with Docker have already been invented; if so, what are they?
I filed a feature request for the vagrant-cachier plugin for saving docker build data and attached a bash workaround for that process. If you're okay with hacking around it yourself, you can implement the scripts in Vagrant.
caching docker build data with vagrant
Note that this procedure requires the vagrant-cachier plugin to be installed and has to save and load 300+ MB files from disk if they are new to the machine. Thus it's really slow if you have Dockerfiles with just 1-5 lines of code, but it's fast if you have Dockerfiles with a lot of LOC or images that have to be downloaded from the net.
Also note that this approach saves every intermediate build step. So if you are building an image, change a line in the middle of the Dockerfile, and build again, the docker build process will reuse all cached intermediate containers up to the changed line.
Using base images is still the preferred way, but you can combine both procedures.
Feel free to post improvements and subscribe, so that fgrehm will maybe implement this natively in his plugin.
As Mark O'Connor suggested, one tip is to build a base image for your container(s). This image should contain the dependencies, package installation, downloads... or any other time-consuming activity, and it should need rebuilding much less frequently than the other one(s). In a similar way, if the final state of the execution of a step in your Dockerfile doesn't change, Docker doesn't rebuild that layer. Thus, try to execute the commands that change this state almost every run (e.g. apt-get update) as late as you can, so Docker doesn't have to rebuild the steps before them; likewise, prefer editing your Dockerfiles in the later steps rather than the earlier ones.
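For example, the split might look like the sketch below (image names and packages are illustrative):
# Dockerfile.base -- rebuilt rarely; holds the heavy, stable steps
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y python python-pip

# Dockerfile -- rebuilt often; starts FROM the base, volatile steps last
FROM myuser/mybase:latest
COPY . /app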
Another option, if you compile/download something inside the container, is to have it downloaded or compiled in a host folder and attach it to the container using the -v or --volume option of docker run.
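For instance (the paths and image name are illustrative):
# Mount a host build folder into the container instead of baking it into the image
docker run -v /home/me/myapp/build:/opt/myapp/build myuser/myapp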
Finally, there are other approaches to this issue, such as the one used by Chef with knife container. In this approach you build the container using Chef cookbooks, and each time you build it (because you have edited your cookbooks...) the changes are applied as a new docker layer (AUFS layer), so you don't have to repeat the whole process. I wouldn't recommend this solution unless you have experience with Chef and have cookbooks to manage your software. You have to work harder to get it running, and if you want Chef only to manage docker containers, I don't think it's worth it (although Chef is a great option to manage infrastructure).
To automate the build process in case you have several images that depend on each other, you can have a bash script that helps you with that task (credits to smola@github):
#!/bin/bash
# Images to build, in dependency order; each is expected in a NAME/TAG directory.
IMAGES="${IMAGES:-stratio/base:test stratio/mesos:test stratio/spark-mesos:test stratio/ingestion:test}"
LATEST_TAG="${LATEST_TAG:-test}"
for image in $IMAGES ; do
    USER=${image/\/*/}   # repository user (part before the slash)
    aux=${image/*\//}    # name:tag (part after the slash)
    NAME=${aux/:*/}      # image name (before the colon)
    TAG=${aux/*:/}       # image tag (after the colon)
    DIR=${NAME}/${TAG}   # directory holding this image's Dockerfile
    pushd $DIR
    docker build --tag=${USER}/${NAME}:${TAG} .
    # Additionally tag this build as :latest when it matches LATEST_TAG.
    if [[ $TAG = $LATEST_TAG ]] ; then
        docker tag ${USER}/${NAME}:${TAG} ${USER}/${NAME}:latest
    fi
    popd
done
There are a couple of tricks that might improve your workflow (very web-focused).
Docker caching
Always make sure you add your source to your Docker image at the very end of the Dockerfile.
Example:
# Copy only the dependency manifest first, so the npm install layer stays cached
COPY data/package.json /data/
RUN cd /data && npm install
# Copy the (frequently changing) source last
COPY data/ /data
This ensures optimal caching when building the image, so Docker doesn't have to rebuild the npm packages when you are only changing your source.
Also, make sure you don't have a base image that adds folders/files that are often changed (like base images doing COPY . /data/).
fig mount
Use fig (or another tool) and mount your source directory when developing. This way, you can develop with instant changes and still use the current version of your code when building the image.
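A minimal fig.yml along those lines might look like this (the service name, paths and port are illustrative):
web:
  build: .
  volumes:
    - .:/data        # host source mounted over the image's copy during development
  ports:
    - "8080:8080"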
development server
You can start your development web server when you are developing, and nginx when not (if you are developing a www app, but the same idea applies to other apps).
For example, in your startup script, do something like:
if [[ $DEBUG ]]; then
    /usr/bin/supervisorctl start gulp
else
    /usr/bin/supervisorctl start nginx
fi
And have autostart=false in your supervisord.conf files.
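For example (the program names and commands are illustrative):
; supervisord.conf -- both programs stay stopped at boot; the startup script picks one
[program:gulp]
command=/usr/bin/gulp watch
autostart=false

[program:nginx]
command=/usr/sbin/nginx -g "daemon off;"
autostart=false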
auto-refresh app
If you are developing a web app, use tools like gulp and e.g. gulp-connect; if you are developing a Python/Django app, use the runserver utility. Both reload the server when they detect changes in the files.
If you are using the if [[ $DEBUG ]] ... trick, make them listen on the same port as your normal instance (nginx). That way, you can have one configuration for your reverse proxy, i.e. just send the traffic to, for example, www:8080, and it will hit your web page both in production and while you are developing.
Create a base image that holds the bulk of your application's dependencies. This will significantly reduce your docker build times.