What is nvOpzp; AND 1=1 OR (<'">iKO))? - sql-injection

I am seeing queries to a public search box on a website several times daily:
'nvOpzp; AND 1=1 OR (<'">iKO)),
What is going on here?

It's just an attempt to see if the code will error out. Two kinds of quotes just in case the other kind is not escaped. The nonsense text to look for in the output - if it's found then likely there is a possible injections.
Possibly a test for XSS too.
This code is looking for three different flaws:
the data is not properly treated before use - escaped for HTML, prepared for SQL
the errors are shown on the screen instead of being logged
the adherence to the above rules is arbitrary and not mandatory
Several times a day simply because the process is mindlessly automated, a range scan. People doing that are usually called script-kiddies - knowing actually nothing they just run some premade software.

Related

Using UIMA Ruta: How do I annotate the first token of a text and use that annotation further?

I would like to annotate the first token of a text and use that annotation in following rules. I have tried different patterns:
Token.begin == 0 (doesn't work, although there definitely is a token that begins at 0)
Token{STARTSWITH(DocumentMetaData)}; (also doesn't work)
The only pattern that works is:
Document{->MARKFIRST(First)};
But if I try to use that annotation e.g. in the following way:
First{->MARK(FirstAgain)};
it doesn't work again. This makes absolutely no sense to me. There seems to be a really weird behaviour with annotations that start at 0.
This trivial task can be a bit tricky indeed, mainly because of the visibility settings. I do not know why your rules in the question do not work without having a look at the text that should be processed.
As for UIMA Ruta 2.7.0, I prefer a rule like:
# Token{->First};
Here some additional thoughts about the rules in the question:
Token.begin == 0;
Normally, there is not token with begin at 0 since the document starts with some whitespaces or line breaks. If there is actually a token that starts at offset 0 and the rule does not match, then something invisible is covering the begin of the end of the token. This depends of course of the filtering settings, but in case that you did not change them, it could be a bom.
Token{STARTSWITH(DocumentMetaData)};
Here, either the problem above applies, or the begin offset is not identical. If the DocumentMetaData covers the complete document, then I would bet on the leading whitespaces. Another reason could be that the internal indexing is broken, e.g., the tokens or the DocumentMetaData are created by an external analysis engine which was called with EXEC and no reindexing was configured in the action. This situation could also occur with unfortunate optimizations using the config params.
Document{->MARKFIRST(First)};
First{->MARK(FirstAgain)};
MARKFIRST creates an annotation using the offset of the first RutaBasic in the matched context IIRC. If the document starts with something invisible, e.g., a line break, then the second rule cannot match.
As a general advice in situations like this when some obvious simple rules do not work correctly as expected, I recommend adding some additional rules and using the debugging config with the explanation view. As rule like Token; can directly highlight if the visibility setting are problematic for the given tokens.
DISCLAIMER: I am a developer of UIMA Ruta

G-WAN: how to get rid of the "?" in URL and how to set default language?

In G-WAN the default URL is in the form mydomain.com/?hello.c
I want to get rid of the ? to have URLs that look like mydomain.com/hello
The user manual mentions substituting a different character such as ' for ?. In that case the URL would look like mydomain.com/'hello.c
But I don't want to use a different character, I want to get rid of the special character completely. Is that possible?
The default language for G-WAN is C. So mydomain.com/?hello means mydomain.com/?hello.c
How do I change the default to a different language, say Java, so that mydomain.com/?hello now means mydomain.com/?hello.java
Can I set different default languages for different virtual hosts?
Finally, how do I change the URL format for passing parameters? According to the user manual the default format is:
mydomain.com/?hello.c&name=Eva
I want to change it to:
mydomain.com/hello?name=Eva
Is that possible?
This has already been asked many times, and a few solutions are found here:
G-WAN handler rewriting solution
You should note, however, that the way you mean to pass arguments as ?something=answer instead of & only applies to the first argument passed. You can't do ?this=that?somethingelse=this because only the first can be ? and the rest must be &. In fact you can ignore using ? completely and only use & with virtually unlimited arguments so it's in fact better to stick to only using &.
It's important to note for future reference to anyone asking similar questions, G-WAN gives you the entire headers through multiple steps of the HTTP transaction and being that you can modify them with c/c++, you can change anything at all that you want before the requests are handled by the server or sent back to the client. The only limitation is your knowledge and imagination.

How to generate stream of custom symbols instead of ints in lexer?

I am asking for gplex, however it might be the case, the solution to the problem works for other lex-derived tools.
I wrote all rules, everything is fine with one exception. The type of the scan method of the generated scanner is int, and I would like to be MySymbol (which would consist of id of the token -- INT, STR, PLUS, so on, its value, and possible location in the file).
I checked the samples (not many of them), but they are very simplistic and just write out the fact rule was matched, I've read the manual, but it starts from parser perspective and for now I am a bit lost.
One of my rules in lex file:
while { return new MySymbol(MyTokens.WHILE); }
All I have now is scanning phase, I have to finish it, and then I will think about parser.
Yacc and Yacc-like tools (here GPLex) relies on side effect. Normally you could think of returning the data, but here you are returning token id, and any extra data has to be "passed" via special variables like yyval.

How to prevent SQL injection if I don't have option to use "PreparedStatement" in Java/J2EE

I have one application In which I can’t user “PreparedStatement” on some of places.
Most of SQL queries are like….
String sql = "delete from " + tableName;
So I like to know how to fix “SQL Injection” problem in my code.
Regards,
Sanjay Singh
=======================Edited After getting answer and like to verify solution==========
According to provided suggestion I have identified one strategy to prevent SQL injection in my case ….
Like to know views, I am working on the VeraCode Certificate for our application…
Filter Data so it does not content any space and escape SQL character (so if there is any injected code,
it’ll not going to part of my dynamic SQL, so my column name and table name can’t use to inject SQL query).
public static String getTabColName(String tabColName)
{
if(tabColName == null || "".equals(tabColName.trim()))
return "";
String tempStr = StringEscapeUtils.escapeSql(tabColName.trim());
//If this value content space that means it is not a valid table
// or column name, so don’t use it in dynamic generated SQL
//use space so it create an invalid SQL query
return tempStr.indexOf(' ') == -1 ? tempStr : "";
}
Parameterised queries are a major step towards preventing SQL injection attacks. If you cannot use them, you have an equally major setback in your hands. You can somewhat mitigate the danger by:
input string validation. And I mean validation with all the bells and whistles, which can sometimes reach the level of a full-blown parser, not just a few checks.
input manipulation (e.g. quoting and string escaping). Again, you have to do this right, which can be harder than it seems.
Both techniques are problematic - you have to let valid input through unchanged, in order to maintain compatibility with your current codebase, while still effectively protecting your system. Good luck with that...
From my experience, refactoring - or even rewriting - your code to use prepared statements will save you a lot of time and tears in the long run.
If you don't have a peer-reviewed library of string-escaping functions, at the very least you should white-list characters that you know are safe to embed in strings. For instance, ensure your strings are composed only of letters, digits and underscores, and nothing else. Black-listing known "bad characters" is poised to get you in trouble.
Making sure that the input contains only allowed characters is just an important first step. Your sample statement is a good example for the value of the strategy "find input in a list of all good values" (you surely know the set of tables in your database and the subset of tables users are allowed to zap). "compare input against plausible range" (salary shouldn't be increased by millions or half cents), or "match input against a regex to reveal structural violations" are further examples.
To get confidence in your defenses, you may consider using a QuickCheck-like testing library to attack your validation functions by (suitably biased) random strings. This article lists implementations for languages other than Haskell.

Using table-of-contents in code?

Do you use table-of-contents for listing all the functions (and maybe variables) of a class in the beginning of big source code file? I know that alternative to that kind of listing would be to split up big files into smaller classes/files, so that their class declaration would be self-explanatory enough.. but some complex tasks require a lot of code. I'm not sure is it really worth it spending your time subdividing implementation into multiple of files? Or is it ok to create an index-listing additionally to the class/interface declaration?
EDIT:
To better illustrate how I use table-of-contents this is an example from my hobby project. It's actually not listing functions, but code blocks inside a function.. but you can probably get the idea anyway..
/*
CONTENTS
Order_mouse_from_to_points
Lines_intersecting_with_upper_point
Lines_intersecting_with_both_points
Lines_not_intersecting
Lines_intersecting_bottom_points
Update_intersection_range_indices
Rough_method
Normal_method
First_selected_item
Last_selected_item
Other_selected_item
*/
void SelectionManager::FindSelection()
{
// Order_mouse_from_to_points
...
// Lines_intersecting_with_upper_point
...
// Lines_intersecting_with_both_points
...
// Lines_not_intersecting
...
// Lines_intersecting_bottom_points
...
// Update_intersection_range_indices
for(...)
{
// Rough_method
....
// Normal_method
if(...)
{
// First_selected_item
...
// Last_selected_item
...
// Other_selected_item
...
}
}
}
Notice that index-items don't have spaces. Because of this I can click on one them and press F4 to jump to the item-usage, and F2 to jump back (simple visual studio find-next/prevous-shortcuts).
EDIT:
Another alternative solution to this indexing is using collapsed c# regions. You can configure visual studio to show only region names and hide all the code. Of course keyboard support for that source code navigation is pretty cumbersome...
I know that alternative to that kind of listing would be to split up big files into smaller classes/files, so that their class declaration would be self-explanatory enough.
Correct.
but some complex tasks require a lot of code
Incorrect. While a "lot" of code be required, long runs of code (over 25 lines) are a really bad idea.
actually not listing functions, but code blocks inside a function
Worse. A function that needs a table of contents must be decomposed into smaller functions.
I'm not sure is it really worth it spending your time subdividing implementation into multiple of files?
It is absolutely mandatory that you split things into smaller files. The folks that maintain, adapt and reuse your code need all the help they can get.
is it ok to create an index-listing additionally to the class/interface declaration?
No.
If you have to resort to this kind of trick, it's too big.
Also, many languages have tools to generate API docs from the code. Java, Python, C, C++ have documentation tools. Even with Javadoc, epydoc or Doxygen you still have to design things so that they are broken into intellectually manageable pieces.
Make things simpler.
Use a tool to create an index.
If you create a big index you'll have to maintain it as you change your code. Most modern IDEs create list of class members anyway. it seems like a waste of time to create such index.
I would never ever do this sort of busy-work in my code. The most I would do manually is insert a few lines at the top of the file/class explaining what this module did and how it is intended to be used.
If a list of methods and their interfaces would be useful, I generate them automatically, through a tool such as Doxygen.
I've done things like this. Not whole tables of contents, but a similar principle -- just ad-hoc links between comments and the exact piece of code in question. Also to link pieces of code that make the same simplifying assumptions that I suspect may need fixing up later.
You can use Visual Studio's task list to get a listing of certain types of comment. The format of the comments can be configured in Tools|Options, Environment\Task List. This isn't something I ended up using myself but it looks like it might help with navigating the code if you use this system a lot.
If you can split your method like that, you should probably write more methods. After this is done, you can use an IDE to give you the static call stack from the initial method.
EDIT: You can use Eclipse's 'Show Call Hierarchy' feature while programming.