Macros in Data Fusion using Argument Setter

I want to make my Data Fusion pipeline reusable by supplying parameter values through Argument Setter. As suggested in several other answers, I have tried implementing the reusable pipeline example given in the Google Cloud guide, but I was not able to pass the parameter JSON file. How do I create the API for that parameter JSON file stored in Google Cloud Storage? Please explain the values to be passed to Argument Setter, such as URL, request, response, etc., if any of you have implemented this in your projects.
Thank you.

The Argument Setter plugin reads from an HTTP endpoint, and that endpoint must be publicly accessible, as described in the GCP documentation. Currently there is no way to read from a non-public file stored in GCS. This limitation has been reported to CDAP for improvement through this ticket.

Can you please provide what you've tried so far and where you're stuck?
The URL field in Argument Setter should contain the API endpoint you're making a call to. Make sure you include any headers your call needs, such as Authorization, Accept, etc.
If you're having issues with Argument Setter, a good check is to use curl or any other tool to make sure you're able to talk to the endpoint you're trying to use.
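For example, with curl (the endpoint URL and header below are placeholders, not values from this thread):

curl -H "Authorization: Bearer <token>" "https://example.com/pipeline-arguments"

If the call returns the expected JSON, the same URL and headers should work in Argument Setter.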
Here's some documentation about Argument setter: https://github.com/data-integrations/argument-setter

Define a JSON file with the appropriate name/value pairs (a sample is sketched below). Upload it to a GCS bucket and make it public by changing the permissions (add "allUsers" to the permissions list). When you save it, the file will be marked "Public to internet".
Copy the https path to the file and use it in Argument Setter. If you're able to access this path from curl or your browser, Argument Setter will be able to as well.
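As a rough sketch, the file the plugin fetches is expected to look something like this (the argument names and values are placeholders; check the Argument Setter README linked above for the exact schema your plugin version expects):

{
  "arguments": [
    { "name": "input.path",  "type": "string", "value": "gs://my-bucket/input" },
    { "name": "output.path", "type": "string", "value": "gs://my-bucket/output" }
  ]
}

Each name should match a macro referenced in the pipeline (for example ${input.path}), and the file's public URL goes into the Argument Setter URL field.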
There are other problems I've encountered while using Argument Setter, though - the pipeline often does not let runtime arguments override the default values provided via the URL, especially when the pipeline is duplicated.
To make the file public:

You have to make your bucket public; currently there is no other way.
gsutil iam ch allUsers:objectViewer gs://BUCKET_NAME

Related

What is the best practice to design the REST API URL if one resource identifier is a path?

It is straightforward to put a resource ID into the URL if it is an int or long type, e.g.
GET files/123
But my problem is that my resource identifier is a path, e.g. /folder_1/folder_2/a.sh, because the underlying implementation is a filesystem. So I cannot put it directly into the REST API URL, since it conflicts with the URL path.
Here are the approaches I can think of:
Put the path ID in a request parameter, e.g.
GET files?path=/folder_1/folder_2/a.sh
Encode/decode the path so that it qualifies as part of the URL.
Introduce another int/long ID for this resource in the backend and map it to the path. The int/long resource ID is stored in the database, and I need to maintain the mapping for each CRUD operation.
I am not sure whether approach 1 is RESTful; approach 2 needs extra encoding/decoding, and approach 3 needs extra work to maintain the mapping.
I wonder what the best practice is to design the REST API URL for this kind of case.
Simple:
@GET
@Path("/files/{path:.+}")
@Produces(MediaType.TEXT_PLAIN)
public String files(@PathParam("path") String path) {
    return path;
}
When you query files/test1/tes2 via the URL, the output is:
test1/tes2
Just put the path after a prefix, for example:
GET /files/folder_1/folder_2/a.sh
There isn't a conflict, since when the request path starts with your known prefix (/files/, in the above example), you know that the rest should be parsed as the path to the file, including any slashes.
Well, my experience designing "restful" APIs shows that you have to take into consideration future extensions of your API.
So, the guidelines work best when followed closely when it makes sense.
In your specific example, the path of the file is more of an attribute of the file, that can also serve as its unique ID.
From your API client's perspective, /files/123 would make perfect sense, but /files/dir1/file2.txt is debatable.
A query parameter here would probably help more, much like what you would do if you wanted to retrieve a filtered list of files, rather than the whole collection.
On the other hand, using a query parameter would also help for future extensions, since supporting /files/{path} would also mean conflicts when attempting to add sub-resources to your files endpoint.
For example, let's assume that in the future you might need another endpoint, /files/attributes. But having such an endpoint would exclude any possibility for your clients to match a file named attributes.
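As a minimal sketch of the query-parameter variant (approach 1 above) - the class name, resource path, and the backing lookup are placeholders, not taken from the original answers:

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/files")
public class FileResource {

    // GET /files?path=/folder_1/folder_2/a.sh
    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public Response getFile(@QueryParam("path") String path) {
        if (path == null || path.isEmpty()) {
            // Without the parameter, treat the request as "list the collection".
            return Response.ok("listing of all files").build();
        }
        // Look the file up by its path attribute (placeholder logic).
        return Response.ok("contents of " + path).build();
    }
}

This keeps /files/{sub-resource} free for future endpoints such as /files/attributes.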

How to split openAPI/Swagger file into multiple valid sub-files?

Our service implements different levels of access and we are using one openAPI YAML file internally.
For external documentation purposes, we would like to create multiple OpenAPI files that are valid in themselves (self-contained), but only contain a subset of the global file, e.g. based on the path or on tags.
(The same path may be used in different split files, but I don't think that is a problem.)
Any idea on how to achieve that? Is there some tooling around for it?
You can use a valid URI in a JSON Pointer which points to another resource. The URI can be a path to a local file, a web resource, etc.:
paths:
  /user/{id}:
    summary: Get a user
    parameters:
      - $ref: "./path/to/file#/user_id"
    # And so on...
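The referenced file then just needs to contain a valid fragment at that pointer. A hedged sketch of what ./path/to/file might hold, in OpenAPI 3 style (the parameter definition is an assumption matching the example above):

# ./path/to/file
user_id:
  name: id
  in: path
  required: true
  description: Identifier of the user
  schema:
    type: integer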
Reserved keys in the OpenAPI spec must be unique so I don't think you'd be able to create standalone OpenAPI specs without some third-party utility that could overcome that limitation.
However, you would be able to create valid standalone JSON objects defined across many files and reference them in the index document. There are many articles online providing examples:
https://davidgarcia.dev/posts/how-to-split-open-api-spec-into-multiple-files/
https://blog.pchudzik.com/202004/open-api-and-external-ref/
I ended up writing a Python script, which I have posted here.
Flow
Read the YAML File into a dictionary
Copy the dictionary to a new dictionary
Iterate through the original dictionary and
Remove items that are not tagged with the tag(s) you want to keep
Remove items that have some keyword you want to omit in the path
Write out the dictionary to a new YAML
The GIST is available here:
https://gist.github.com/erikm30/d1f7e1cea3f18ece207ccdcf9f12354e
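The linked gist is Python; as a rough illustration of the same flow in Java (assuming the SnakeYAML library; file names and the tag to keep are placeholders, and the path list is copied instead of the whole dictionary so entries can be removed while iterating):

import org.yaml.snakeyaml.Yaml;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.Reader;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SplitSpecByTag {
    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        Yaml yaml = new Yaml();
        String keepTag = "public"; // placeholder: the tag whose operations we keep

        // 1. Read the YAML file into a map (the "dictionary").
        Map<String, Object> spec;
        try (Reader in = new FileReader("openapi.yaml")) {
            spec = yaml.load(in);
        }

        // 2./3. Iterate over the paths and drop every path that is not tagged with keepTag.
        Map<String, Object> paths = (Map<String, Object>) spec.get("paths");
        for (String path : new ArrayList<>(paths.keySet())) {
            Map<String, Object> item = (Map<String, Object>) paths.get(path);
            boolean keep = item.values().stream()
                    .filter(op -> op instanceof Map)
                    .map(op -> (List<String>) ((Map<String, Object>) op).get("tags"))
                    .anyMatch(tags -> tags != null && tags.contains(keepTag));
            if (!keep) {
                paths.remove(path);
            }
        }

        // 4. Write the filtered dictionary out to a new YAML file.
        try (Writer out = new FileWriter("openapi-" + keepTag + ".yaml")) {
            yaml.dump(spec, out);
        }
    }
}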

Can you use Azure DevOps build variables within another variable in an Azure DevOps Library Group?

I'm trying to construct a variable by inserting values from another variable or environment variable, a bit like a template.
Similar to this example from Octopus...
The ConnectionString variable is a template which uses Server and Database variables.
The examples above exist because Octopus uses different values per environment. In my case I'd like to keep the template as an unprotected variable so I can see it, and have the inserted variable protected because it contains sensitive information.
I've tried using macro syntax ($(Server)) and runtime expression syntax ($[Server]), neither of which seem to replace the values at build time.
Expression syntax ${{Server}} gives me an error "bad substitution" which implies that there's a good substitution but I'm missing something.
This is not supported. We cannot use a dynamic password with a service endpoint. If you want to change the password, you need to update the endpoint directly.
Besides, a service endpoint is independent; it should not depend on other variables or a variable group.
There is no need to use a protected SECRET_REPLACED_AT_BUILD variable to protect the password; the password in the service endpoint is already protected.
You can refer to the format of Environment Variables with Credential Provider in the official documentation:
VSS_NUGET_EXTERNAL_FEED_ENDPOINTS: JSON that contains an array of service endpoints, usernames and access tokens to authenticate endpoints in nuget.config. Example:
{
  "endpointCredentials": [
    { "endpoint": "http://example.index.json", "username": "optional", "password": "accesstoken" }
  ]
}

Manipulating path mapping in AWS API gateway integration

I would like to modify a URL parameter /resource/{VaRiAbLe} in an API Gateway to S3 mapping so that it actually points to /my-bucket/{variable}. That is, it accepts mixed-case input and maps it to a lower-case name. Mapping path variables to S3 integrations is simple enough, but I can't seem to get a lower-case mapping working.
Reading through the documentation for mapping parameters, it looks like path parameters are simple string values (not templated values), so defining a mapping as method.request.path.variable.toLowerCase() won't work.
Does anyone have any ideas how to implement this mapping?
Map path variables to a JSON body, and then call another API method that actually does the S3 call?
Bite the bullet, and implement a Lambda function to do the S3 get for me?
Find another API method for S3 that accepts a JSON body that I can use to get the data?
Update using Orchestrated calls
Following the info from Jack, I figured I should try doing the orchestrated call, since the traffic volume is low enough that I am sure that I won't be able to keep the lambda hot.
As a proof of concept, I added two methods to my resource (sitting at /resource/{variable}) - GET and POST. The GET method chains to the POST, which does the actual retrieving of the data.
POST method configuration
This is a vanilla S3 proxying method, where you set the URL Path parameter for {variable} to be method.request.body.variable.
GET method configuration
This is an HTTPS proxying method. You'll need a URL for the POST method, so you'll need to deploy the API to get that URL. The only other configuration needed here is a body mapping template with content like:
{
  "variable": "$input.params('variable').toLowerCase()",
  "something": "$input.params('something')"
}
This should be enough to get this working.
The downside is that I'm adding an extra method (POST) to that resource, which could confuse consumers of the API. I think it should be possible to put the POST on the /resource resource instead, which would at least make a bit more sense from an API design standpoint.
Depending on how frequently this API will be called, I'd either go with the Lambda proxy or chaining two API Gateway methods together. If the API is called frequently enough to keep a Lambda function warm (say once a minute), then go with Lambda. If not, go with the orchestrated API call.
The orchestrated API call would be interesting, I'd be happy to help with that if you have questions.
As far as I know the only S3 API for getting object data is the GET that is documented in their API reference.

JAX-RS and unknown query parameters

I have a Java client that calls a RESTEasy (JAX-RS) Java server. It is possible that some of my users may have a newer version of the client than the server.
That client may call a resource on the server that contains query parameters that the server does not know about. Is it possible to detect this on the server side and return an error?
I understand that if the client calls a URL that has not been implemented yet on the server, the client will get a 404 error, but what happens if the client passes in a query parameter that is not implemented (e.g.: ?sort_by=last_name)?
Is it possible to detect this on the server side and return an error?
Yes, you can do it. I think the easiest way is to use @Context UriInfo. You can obtain all query parameters by calling the getQueryParameters() method, so you know whether there are any unknown parameters and can return an error.
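A minimal sketch of that idea (the resource path, known parameter names, and error body are placeholders, not from the original answer):

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.UriInfo;

@Path("/people")
public class PeopleResource {

    // Only these query parameters are understood by this resource.
    private static final Set<String> KNOWN_PARAMS = new HashSet<>(Arrays.asList("page", "page_size"));

    @GET
    public Response list(@Context UriInfo uriInfo,
                         @QueryParam("page") Integer page,
                         @QueryParam("page_size") Integer pageSize) {
        for (String name : uriInfo.getQueryParameters().keySet()) {
            if (!KNOWN_PARAMS.contains(name)) {
                // e.g. ?sort_by=last_name would be rejected here.
                return Response.status(Response.Status.BAD_REQUEST)
                               .entity("Unknown query parameter: " + name)
                               .build();
            }
        }
        // ... normal handling ...
        return Response.ok().build();
    }
}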
but what happens if the client passes in a query parameter that is not implemented
If you implement no special handling of "unknown" parameters, the resource will be called and the parameter will be silently ignored.
Personally I think that it's better to ignore the unknown parameters. If you just ignore them, it may help to make the API backward compatible.
You should definitely check out JAX-RS filters (org.apache.cxf.jaxrs.ext.RequestHandler) to intercept, validate, and manipulate requests, e.g. for security or for validating query parameters.
If you declared all your parameters using annotations, you can parse the web.xml file for the resource class names (see a possible regex below) and use the fully qualified class names to access the declared annotations on methods (like javax.ws.rs.GET) and method parameters (like javax.ws.rs.QueryParam) in order to scan all available web service resources - this way you don't have to manually add all resource classes to your filter.
Store this information in static variables so you just have to parse this stuff the first time you hit your filter.
In your filter you can access the org.apache.cxf.message.Message for the incoming request. The query string is easy to access - if you also want to validate form parameters and multipart names, you have to read the message content and write it back to the message (this gets a bit nasty, since you have to deal with multipart boundaries etc.).
To 'index' the resources I just take the HTTP method and append the path (which is then used as the key to access the declared parameters).
You can use the ServletContext to read the web.xml file. For extracting the resource classes this regex might be helpful
String webxml = readInputStreamAsString(context.getResourceAsStream("WEB-INF/web.xml"));
Pattern serviceClassesPattern = Pattern.compile("<param-name>jaxrs.serviceClasses</param-name>.*?<param-value>(.*?)</param-value>", Pattern.DOTALL | Pattern.MULTILINE);
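A hedged sketch of such a filter, using the legacy CXF RequestHandler extension point. The answer above builds the set of allowed parameters dynamically from the scanned annotations and indexes it by HTTP method plus path; this sketch hard-codes the set to stay short, and the parameter names and error message are placeholders:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.ws.rs.core.Response;
import org.apache.cxf.jaxrs.ext.RequestHandler;
import org.apache.cxf.jaxrs.model.ClassResourceInfo;
import org.apache.cxf.message.Message;

public class UnknownQueryParamFilter implements RequestHandler {

    // In the approach described above, this set would be built per "HTTP method + path"
    // from the scanned @QueryParam annotations.
    private static final Set<String> KNOWN_PARAMS = new HashSet<>(Arrays.asList("sort_by", "page"));

    public Response handleRequest(Message message, ClassResourceInfo resourceClass) {
        String query = (String) message.get(Message.QUERY_STRING); // e.g. "sort_by=last_name"
        if (query == null || query.isEmpty()) {
            return null; // returning null lets the request proceed to the resource
        }
        for (String pair : query.split("&")) {
            String name = pair.split("=", 2)[0];
            if (!KNOWN_PARAMS.contains(name)) {
                return Response.status(Response.Status.BAD_REQUEST)
                               .entity("Unknown query parameter: " + name)
                               .build();
            }
        }
        return null;
    }
}

The filter would then be registered as a provider in the CXF JAX-RS configuration.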