Xtext: first and last characters truncated in custom STRING terminals

I have redefined the STRING terminal this way:
terminal STRING : ('.'|'+'|'('|')'|'a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
because I have to recognize strings that are not delimited by " or '.
The problem is that, although the generated parser works, it strips the first and the last character of the recognized string. What am I missing?

If you customize the STRING rule, you'll have to adapt the respective value converter, too.
Something like this has to be bound in your runtime module:
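import org.eclipse.xtext.conversion.impl.STRINGValueConverter;
import org.eclipse.xtext.nodemodel.INode;

public class MyStringValueConverter extends STRINGValueConverter {
    // Serialization: write the value back as-is, without adding quotes or escapes.
    @Override
    protected String toEscapedString(String value) {
        return value;
    }

    // Parsing: the token text already is the value, so nothing is stripped.
    @Override
    public String toValue(String string, INode node) {
        if (string == null)
            return null;
        return string;
    }
}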
See the docs for details.
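To make the symptom concrete, here is a toy sketch in plain Java. It does not use the Xtext API; the class and method names are hypothetical, and it assumes (as a simplification) that a converter written for quoted strings simply treats the first and last character as delimiters. Applied to an unquoted token, that assumption eats two real characters, which matches what the question describes.

```java
// Toy illustration only: hypothetical names, no Xtext classes involved.
class DelimiterStrippingDemo {

    // Mimics a converter that assumes the token is wrapped in quotes
    // and therefore discards the first and the last character.
    static String quotedToValue(String token) {
        if (token == null) {
            return null;
        }
        if (token.length() < 2) {
            return "";
        }
        return token.substring(1, token.length() - 1);
    }

    // Mimics the pass-through converter from the answer: the token is the value.
    static String unquotedToValue(String token) {
        return token;
    }

    public static void main(String[] args) {
        System.out.println(quotedToValue("\"hello\""));     // hello
        System.out.println(quotedToValue("foo.bar(1)"));    // oo.bar(1
        System.out.println(unquotedToValue("foo.bar(1)"));  // foo.bar(1)
    }
}
```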


How to use FileIO.writeDynamic() in Apache Beam 2.6 to write to multiple output paths?

I am using Apache Beam 2.6 to read from a single Kafka topic and write the output to Google Cloud Storage (GCS). Now I want to alter the pipeline so that it is reading multiple topics and writing them out as gs://bucket/topic/...
When reading only a single topic I used TextIO in the last step of my pipeline:
TextIO.write()
.to(
new DateNamedFiles(
String.format("gs://bucket/data%s/", suffix), currentMillisString))
.withWindowedWrites()
.withTempDirectory(
FileBasedSink.convertToFileResourceIfPossible(
String.format("gs://bucket/tmp%s/%s/", suffix, currentMillisString)))
.withNumShards(1));
There is a similar question whose code I tried to adapt:
FileIO.<EventType, Event>writeDynamic()
    .by(
        new SerializableFunction<Event, EventType>() {
          @Override
          public EventType apply(Event input) {
            return EventType.TRANSFER; // should return real type here, just a dummy
          }
        })
    .via(
        Contextful.fn(
            new SerializableFunction<Event, String>() {
              @Override
              public String apply(Event input) {
                return "Dummy"; // should return the Event converted to a String
              }
            }),
        TextIO.sink())
    .to(DynamicFileDestinations.constant(new DateNamedFiles("gs://bucket/tmp%s/%s/",
            currentMillisString),
        new SerializableFunction<String, String>() {
          @Override
          public String apply(String input) {
            return null; // Not sure what this should exactly, but it needs to
            // include the EventType into the path
          }
        }))
    .withTempDirectory(
        FileBasedSink.convertToFileResourceIfPossible(
            String.format("gs://bucket/tmp%s/%s/", suffix, currentMillisString)))
    .withNumShards(1))
The official JavaDoc contains example code which seems to have outdated method signatures (the .via method seems to have switched the order of its arguments). I also stumbled across the example in FileIO, which confused me: shouldn't TransactionType and Transaction in this line change places?
After a night of sleep and a fresh start I figured out the solution. I used the functional Java 8 style, as it makes the code shorter (and more readable):
.apply(
    FileIO.<String, Event>writeDynamic()
        // the destination key is the Kafka topic of the event
        .by((SerializableFunction<Event, String>) input -> input.getTopic())
        .via(
            Contextful.fn(
                (SerializableFunction<Event, String>) input -> input.getPayload()),
            TextIO.sink())
        .to(String.format("gs://bucket/data%s/", suffix))
        .withNaming(type -> FileNaming.getNaming(type, "", currentMillisString))
        .withDestinationCoder(StringUtf8Coder.of())
        .withTempDirectory(
            String.format("gs://bucket/tmp%s/%s/", suffix, currentMillisString))
        .withNumShards(1));
Explanation:
Event is a Java POJO containing the payload of the Kafka message and the topic it belongs to. It is parsed in a ParDo after the KafkaIO step.
suffix is either dev or empty and is set by environment variables.
currentMillisString contains the timestamp at which the whole pipeline was launched, so that new files don't overwrite old files on GCS when a pipeline gets restarted.
FileNaming implements a custom naming scheme and receives the type of the event (the topic) in its constructor. It uses a custom formatter to write to daily partitioned "sub-folders" on GCS:
class FileNaming implements FileIO.Write.FileNaming {
static FileNaming getNaming(String topic, String suffix, String currentMillisString) {
return new FileNaming(topic, suffix, currentMillisString);
}
private static final DateTimeFormatter FORMATTER = DateTimeFormat
.forPattern("yyyy-MM-dd").withZone(DateTimeZone.forTimeZone(TimeZone.getTimeZone("Europe/Zurich")));
private final String topic;
private final String suffix;
private final String currentMillisString;
private String filenamePrefixForWindow(IntervalWindow window) {
return String.format(
"%s/%s/%s_", topic, FORMATTER.print(window.start()), currentMillisString);
}
private FileNaming(String topic, String suffix, String currentMillisString) {
this.topic = topic;
this.suffix = suffix;
this.currentMillisString = currentMillisString;
}
@Override
public String getFilename(
BoundedWindow window,
PaneInfo pane,
int numShards,
int shardIndex,
Compression compression) {
IntervalWindow intervalWindow = (IntervalWindow) window;
String filenamePrefix = filenamePrefixForWindow(intervalWindow);
String filename =
String.format(
"pane-%d-%s-%05d-of-%05d%s",
pane.getIndex(),
pane.getTiming().toString().toLowerCase(),
shardIndex,
numShards,
suffix);
String fullName = filenamePrefix + filename;
return fullName;
}
}
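To see what object names this naming scheme yields, the two String.format calls can be reproduced without Beam. The sketch below is a stand-alone approximation: it swaps Joda-Time for java.time, takes the pane and shard values as plain parameters instead of Beam's window and pane objects, and the topic, timestamp and launch value are made-up inputs.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Stand-alone approximation of FileNaming.getFilename(); no Beam types are used.
class FileNameSketch {

    // Same day pattern and time zone as the FORMATTER in the answer (java.time instead of Joda).
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneId.of("Europe/Zurich"));

    static String fileName(String topic, Instant windowStart, String currentMillisString,
                           long paneIndex, String paneTiming,
                           int shardIndex, int numShards, String suffix) {
        // "<topic>/<day>/<launch-millis>_" corresponds to filenamePrefixForWindow()
        String prefix = String.format(Locale.ROOT, "%s/%s/%s_",
            topic, DAY.format(windowStart), currentMillisString);
        // "pane-<index>-<timing>-<shard>-of-<shards><suffix>" corresponds to the rest
        String name = String.format(Locale.ROOT, "pane-%d-%s-%05d-of-%05d%s",
            paneIndex, paneTiming.toLowerCase(Locale.ROOT), shardIndex, numShards, suffix);
        return prefix + name;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: topic "orders", a window starting on 1 September 2018.
        System.out.println(fileName("orders", Instant.parse("2018-09-01T10:00:00Z"),
            "1535760000000", 0, "ON_TIME", 0, 1, ""));
        // orders/2018-09-01/1535760000000_pane-0-on_time-00000-of-00001
    }
}
```

The returned name is relative, so with the pipeline above it would end up below the directory passed to .to(...).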

In btrace, how can I print a byte array in a readable format?

I want to use btrace to inspect the byte[] return value of a method using the @Return annotation.
The byte array is actually a normal string encoded as UTF-8.
The class looks like this:
class A {
    byte[] method1() {
        ...
    }
}
I have tried printArray, but it only accepts Object[], so it does not work for byte[].
print just outputs the internal object identity, like '[B@4fbc7b65'.
Is there any other way to solve the problem?
Yes, this is an omission in BTrace (https://github.com/btraceio/btrace/issues/322).
For now, use "trusted" mode, where the safety checks are turned off, and you can do e.g.:
@BTrace(trusted = true)
public class TrustedTrace {
    @OnMethod(clazz = "MyClass", method = "m", location = @Location(Kind.RETURN))
    public static void intercept(@Return byte[] data) {
        println(Arrays.toString(data));
    }
}
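Note that Arrays.toString prints the numeric byte values rather than text. Since the question says the bytes are a UTF-8 encoded string, decoding them may be closer to what is wanted. The following is plain Java, not a BTrace script; whether the same call is accepted inside a trusted BTrace script is an assumption that has not been verified here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Plain-Java comparison of two ways to render a byte[] (not a BTrace script).
class ByteArrayPrinting {

    // Renders the raw byte values, e.g. "[104, 105]".
    static String asNumbers(byte[] data) {
        return Arrays.toString(data);
    }

    // Decodes the bytes as UTF-8 text.
    static String asText(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = "hi".getBytes(StandardCharsets.UTF_8);
        System.out.println(asNumbers(data)); // [104, 105]
        System.out.println(asText(data));    // hi
    }
}
```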

Palindromes: in this program I have to figure out whether the user input is a palindrome or not

Whenever I execute the program, the output says "not a palindrome" even when the input is a palindrome (this only happens when the input has spaces or punctuation). Can someone tell me where I went wrong in my code?
public class Palindromes
{
public static void main(String[]args)
{
ConsoleIO keyboard=new ConsoleIO();
String word, word2="",terminate;
int length;
do
{
System.out.print("Enter a string:");
word=keyboard.readLine();
word=word.toLowerCase();
word=word.trim();
word=word.replaceAll("\\W", "");
word=word.replaceAll(" ","");
length=word.length();
//finding the reverse of the string
for(int i=length-1;i>=0;i--)
{
word2+=word.charAt(i);
}
//checking to see if the string is a palindrome
if(word.length()==1)
{
System.out.println("The string you entered is not a palindrome");
}
else if(word.equals(word2))
{
System.out.println("The string you entered is a palindrome.");
}
else
{
System.out.println("The string you entered is not a palindrome.");
}
System.out.print("Do you want to continue (yes or no):");
terminate=keyboard.readLine();
System.out.println();
}
while(terminate.equalsIgnoreCase("yes"));
}
}
I think you need to account for the punctuation, because it affects your palindrome test: ra.cecar is not a palindrome unless the punctuation is removed. Have you tried adding more lines like the following? Note that replaceAll takes a regular expression, in which . matches any character and a lone ? is a syntax error, so the literal replace is used here:
word=word.replace(".", "");
word=word.replace("?", "");
word=word.replace("!", "");
word=word.replace("-", "");
To put an end to all palindrome problems, I've written this Java program. It strips every non-word character and converts the input to lower case, in just 13 lines. Hope this helps, and that others are lucky enough to find it too.
import java.util.Scanner;
public class Palindrome {
    public static void main(String[]args){
        if(isReverse()){System.out.println("This is a palindrome.");}
        else{System.out.print("This is not a palindrome");}
    }
    public static boolean isReverse(){
        Scanner keyboard = new Scanner(System.in);
        System.out.print("Please type something: ");
        String line = ((keyboard.nextLine()).toLowerCase()).replaceAll("\\W","");
        return (line.equals(new StringBuffer(line).reverse().toString()));
    }
}
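One more thing worth checking in the question's code: word2 is initialised once, outside the do/while loop, so from the second input onwards the reversed text is appended to the previous one and the comparison fails. A check that keeps no state between calls avoids this. The sketch below is one possible variant, not the code of either answer: it keeps only ASCII letters and digits (an assumption; \\W would also keep underscores) and compares from both ends. Unlike the question's code, it treats empty and single-character input as palindromes.

```java
import java.util.Locale;

// Stateless palindrome check; class and method names are illustrative.
class PalindromeCheck {

    // Lower-cases the input and drops everything except ASCII letters and digits.
    static String normalize(String input) {
        return input.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    // Compares characters from both ends towards the middle.
    static boolean isPalindrome(String input) {
        String s = normalize(input);
        int left = 0;
        int right = s.length() - 1;
        while (left < right) {
            if (s.charAt(left) != s.charAt(right)) {
                return false;
            }
            left++;
            right--;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("A man, a plan, a canal: Panama!")); // true
        System.out.println(isPalindrome("ra.cecar"));                        // true
        System.out.println(isPalindrome("hello"));                           // false
    }
}
```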

How to specify the thousands and decimal separator used by GWT's NumberFormat

In the doc of GWT's NumberFormat class (http://google-web-toolkit.googlecode.com/svn/javadoc/1.5/com/google/gwt/i18n/client/NumberFormat.html) I read:
"The prefixes, suffixes, and various symbols used for infinity, digits, thousands separators, decimal separators, etc. may be set to arbitrary values, and they will appear properly during formatting. However, care must be taken that the symbols and strings do not conflict, or parsing will be unreliable. For example, the decimal separator and thousands separator should be distinct characters, or parsing will be impossible."
My question is, how do I make sure that "." is used as thousands separator and "," as decimal separator independently of the user's locale settings?
In other words when I use the pattern "###,###,###.######" I want GWT to format the double value 1234567.89 always as "1.234.567,89" no matter what the user's locale is.
Solving this took some work. From the docs and source for NumberFormat, it looks like only locales can be used to set these values. The docs do say you can set the grouping separator, but no such example worked for me. You might think the plain Java way shown at the bottom would work, since GWT emulates the DecimalFormat and DecimalFormatSymbols classes, but they are not formally supported. Perhaps they will be in the future. Further, the LocaleInfo class docs say that you can modify a locale, but I found no methods allowing this.
So, here is the hack way to do it:
NumberFormat.getFormat("#,##0.0#").format(2342442.23d).replace(",", "#");
The right way, which GWT does not support yet, is to use DecimalFormat:
// formatter
DecimalFormat format= new DecimalFormat();
// custom symbol
DecimalFormatSymbols customSymbols=new DecimalFormatSymbols();
customSymbols.setGroupingSeparator('#');
format.setDecimalFormatSymbols(customSymbols);
// test
String formattedString = format.format(2342442.23d);
The output:
2#342#442.23
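Applied to the exact case in the question, the same java.text approach can set both separators. This is a sketch for a regular JVM only; as stated above, it is not expected to compile in GWT client code. The class name is made up, and Locale.ROOT is used so that the remaining symbols do not depend on the machine's default locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// JVM-only sketch: "." as grouping separator and "," as decimal separator.
class FixedSeparators {

    static String format(double value) {
        // Start from locale-neutral symbols, then override the two separators.
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator('.');
        symbols.setDecimalSeparator(',');
        // The pattern itself always uses "," for grouping and "." for decimals.
        DecimalFormat format = new DecimalFormat("###,###,###.######", symbols);
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(1234567.89)); // 1.234.567,89
    }
}
```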
I just came across the same issue. I solved it like this:
public String formatAmount(Double amount) {
String pattern = "#,##0.00";
String groupingSeparator = LocaleInfo.getCurrentLocale().getNumberConstants().groupingSeparator();
String decimalSeparator = LocaleInfo.getCurrentLocale().getNumberConstants().decimalSeparator();
NumberFormat format = NumberFormat.getFormat(pattern);
return format.format(amount).replace(groupingSeparator, "'").replace(decimalSeparator, ".");
}
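Chained replace calls like these work as long as the first replacement does not produce a character that the second one then replaces again. For the swap the question asks for ("," to "." and "." to ","), that is exactly what happens. A single pass over the characters avoids the collision. The helper below is a hypothetical sketch that assumes each separator is a single char; it uses only String and StringBuilder.

```java
// Swaps grouping and decimal separators in one pass, so no replacement is applied twice.
class SeparatorSwap {

    static String swap(String formatted, char fromGrouping, char fromDecimal,
                       char toGrouping, char toDecimal) {
        StringBuilder out = new StringBuilder(formatted.length());
        for (int i = 0; i < formatted.length(); i++) {
            char c = formatted.charAt(i);
            if (c == fromGrouping) {
                out.append(toGrouping);
            } else if (c == fromDecimal) {
                out.append(toDecimal);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String english = "1,234,567.89";
        // Chained replaces collide: every separator ends up as ",".
        System.out.println(english.replace(",", ".").replace(".", ",")); // 1,234,567,89
        // Single pass keeps the two separators apart.
        System.out.println(swap(english, ',', '.', '.', ','));           // 1.234.567,89
    }
}
```

In the formatAmount method above, fromGrouping and fromDecimal would come from the current locale's NumberConstants, as that answer already does.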
The way I've done it is to override GWT's LocaleInfoImpl:
Looking at LocaleInfo, you can see that it holds a private static LocaleInfo instance (line 36) that is built using GWT.create(LocaleInfoImpl.class).
Using GWT deferred binding, we can replace LocaleInfoImpl with a custom implementation:
<replace-with class="your.app.package.to.CustomLocaleInfoImpl">
<when-type-is class="com.google.gwt.i18n.client.impl.LocaleInfoImpl" />
</replace-with>
Extend LocaleInfoImpl along these lines, overriding only the method getNumberConstants:
public class CustomLocaleInfoImpl extends LocaleInfoImpl {
    @Override
    public NumberConstants getNumberConstants() {
        final NumberConstants nc = super.getNumberConstants();
        return new NumberConstants() {
            @Override
            public String notANumber() {
                return nc.notANumber();
            }
            @Override
            public String currencyPattern() {
                return nc.currencyPattern();
            }
            @Override
            public String decimalPattern() {
                return nc.decimalPattern();
            }
            @Override
            public String decimalSeparator() {
                return nc.decimalSeparator();
            }
            @Override
            public String defCurrencyCode() {
                return nc.defCurrencyCode();
            }
            @Override
            public String exponentialSymbol() {
                return nc.exponentialSymbol();
            }
            @Override
            public String globalCurrencyPattern() {
                return nc.globalCurrencyPattern();
            }
            @Override
            public String groupingSeparator() {
                return "#";//or any custom separator you desire
            }
            @Override
            public String infinity() {
                return nc.infinity();
            }
            @Override
            public String minusSign() {
                return nc.minusSign();
            }
            @Override
            public String monetaryGroupingSeparator() {
                return nc.monetaryGroupingSeparator();
            }
            @Override
            public String monetarySeparator() {
                return nc.monetarySeparator();
            }
            @Override
            public String percent() {
                return nc.percent();
            }
            @Override
            public String percentPattern() {
                return nc.percentPattern();
            }
            @Override
            public String perMill() {
                return nc.perMill();
            }
            @Override
            public String plusSign() {
                return nc.plusSign();
            }
            @Override
            public String scientificPattern() {
                return nc.scientificPattern();
            }
            @Override
            public String simpleCurrencyPattern() {
                return nc.simpleCurrencyPattern();
            }
            @Override
            public String zeroDigit() {
                return nc.zeroDigit();
            }
        };
    }
}

Case-insensitive indexing with Hibernate-Search?

Is there a simple way to make Hibernate Search index all its values in lower case, instead of the default mixed case?
I'm using the @Field annotation, but I can't seem to be able to configure some application-level setting.
Fool that I am! The StandardAnalyzer class is already indexing in lowercase. It's just a matter of lowercasing the search terms too; I was assuming the query would do that.
However, if a different analyzer were to be used, application-wide, then it can be set using the property hibernate.search.analyzer.
Lowercasing, term splitting, removing common terms and many more advanced language processing functions are applied by the Analyzer.
Usually you should process user input meant to match indexed strings with the same Analyzer used at indexing; configuring hibernate.search.analyzer sets the default (global) Analyzer, but you can customize it per index, per entity type, per field and even on different entity instances.
It is useful, for example, to have language-specific analysis, so that Chinese descriptions are processed with Chinese-specific routines and Italian descriptions with Italian tokenizers.
The default analyzer is ok for most use cases, and does lowercasing and splits terms on whitespace.
Consider as well that when you use the Lucene QueryParser, the API asks you for the appropriate Analyzer.
When using the Hibernate Search QueryBuilder it attempts to apply the correct Analyzer on each field; see also http://docs.jboss.org/hibernate/search/4.1/reference/en-US/html_single/#search-query-querydsl .
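The point about using the same analysis at index time and at query time can be illustrated without Lucene. In the toy sketch below, analyze is a hypothetical stand-in for an Analyzer that lower-cases and splits on whitespace; it is not the Lucene or Hibernate Search API. A raw mixed-case query term misses the index, while the analyzed one matches.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy stand-in for an analyzer; not the Lucene or Hibernate Search API.
class ToyAnalyzerDemo {

    // Lower-cases the text and splits it on whitespace.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String term : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    // "Indexes" a text by storing its analyzed terms.
    static Set<String> index(String text) {
        return new HashSet<>(analyze(text));
    }

    public static void main(String[] args) {
        Set<String> indexed = index("Hibernate Search Rocks");
        String userInput = "Hibernate";
        // Raw user input does not match the lower-cased index terms.
        System.out.println(indexed.contains(userInput));                    // false
        // Running the input through the same analysis makes it match.
        System.out.println(indexed.containsAll(analyze(userInput)));        // true
    }
}
```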
There are multiple ways to make sorting case-insensitive on a string field.
1. The first way is to add a @Fields annotation to the field/property on the entity, like this:
@Fields({@Field(index=Index.YES,analyze=Analyze.YES,store=Store.YES),
    @Field(index=Index.YES,name = "nameSort",analyzer = @Analyzer(impl=KeywordAnalyzer.class), store = Store.YES)})
private String name;
Suppose the name property has a custom analyzer and you want to sort on it. That is not possible directly, so add a second field named nameSort to the index and apply the sort to that field.
You must apply the KeywordAnalyzer class because that field must not be tokenized, and by default the lowercase factory class is applied to the field.
2. The second way is to implement your own comparator class for sorting, like this:
@Override
public FieldComparator newComparator(String field, int numHits, int sortPos, boolean reversed) throws IOException {
    return new StringValComparator(numHits, field);
}
Create a class that extends FieldComparatorSource and implement the method above (the usage line at the end appears to assume that this class is named StringCaseInsensitiveComparator).
Then create a new class named StringValComparator that extends FieldComparator and implement the following methods:
class StringValComparator extends FieldComparator {
    private String[] values;
    private String[] currentReaderValues;
    private final String field;
    private String bottom;
    StringValComparator(int numHits, String field) {
        values = new String[numHits];
        this.field = field;
    }
    @Override
    public int compare(int slot1, int slot2) {
        final String val1 = values[slot1];
        final String val2 = values[slot2];
        if (val1 == null) {
            if (val2 == null) {
                return 0;
            }
            return -1;
        } else if (val2 == null) {
            return 1;
        }
        return val1.toLowerCase().compareTo(val2.toLowerCase());
    }
    @Override
    public int compareBottom(int doc) {
        final String val2 = currentReaderValues[doc];
        if (bottom == null) {
            if (val2 == null) {
                return 0;
            }
            return -1;
        } else if (val2 == null) {
            return 1;
        }
        return bottom.toLowerCase().compareTo(val2.toLowerCase());
    }
    @Override
    public void copy(int slot, int doc) {
        values[slot] = currentReaderValues[doc];
    }
    @Override
    public void setNextReader(IndexReader reader, int docBase) throws IOException {
        currentReaderValues = FieldCache.DEFAULT.getStrings(reader, field);
    }
    @Override
    public void setBottom(final int bottom) {
        this.bottom = values[bottom];
    }
    @Override
    public String value(int slot) {
        return values[slot];
    }
}
Apply the sorting to the field like this:
new SortField("name",new StringCaseInsensitiveComparator(), true);
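The ordering rule inside compare and compareBottom (nulls first, then a lower-cased comparison) can be tried out on its own with a plain java.util.Comparator. This sketch only mirrors that rule; it does not involve Lucene, and the class name and sample values are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Mirrors the null handling and lower-cased comparison of StringValComparator.compare().
class CaseInsensitiveSortDemo {

    static final Comparator<String> NULLS_FIRST_IGNORE_CASE = (a, b) -> {
        if (a == null) {
            return b == null ? 0 : -1;
        }
        if (b == null) {
            return 1;
        }
        return a.toLowerCase(Locale.ROOT).compareTo(b.toLowerCase(Locale.ROOT));
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(NULLS_FIRST_IGNORE_CASE);
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("bob", null, "Alice", "Carol");
        System.out.println(sorted(names)); // [null, Alice, bob, Carol]
    }
}
```

With the natural String ordering, "bob" would sort after "Carol" because upper-case letters come first; the lower-cased comparison removes that effect.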