How should I determine what is issuing a flush_all command - memcached

We have a memcached server that is shared by about two dozen apps. One of the web apps (or perhaps one of our utility apps) is issuing a flush_all command periodically. The frequency seems random, or at least we haven't seen a pattern yet. It happens about 10 times an hour.
Here's the rub: I can't find a good way to determine which app is doing this. The memcached logs are not helpful at all. Here's what I've done so far:
* grep all source code - Other than memcached libraries I can't see anywhere where we issue this command.
* Enable verbose logging (-vv) in memcached - I see the commands get issued, but the log doesn't show any information about where the command is being issued from.
* Research how to administratively disable this; without an unapproved source patch to memcached I can't figure out a good way to do it.
Has anyone else had this problem? I'm assuming that this is coming from one of our web apps, but it's possible it's from somewhere else too. Any suggestions?
My next step is to set up a second memcached server and move applications one by one (which will be slow and time consuming). There must be a better way.

A little late, but in case anyone else hits this...
I'd suggest you set up multiple memcache proxies and configure each application to use a different one. The first proxy I found was twemproxy, no idea how good it is.
After that you can use the logs for the proxy to identify which application is issuing the commands.
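If a full proxy such as twemproxy is more than you need, the same idea can be sketched as a tiny logging relay. This is only an illustration under assumptions: the listen port 11311 and the upstream address are made up, it only understands the memcached text protocol, and it inspects each received chunk on its own, so a command split across two reads would be missed and a stored value whose line begins with flush_all would be reported by mistake. If several apps share one host, run one relay per app on different ports, as suggested above.

```python
import socket
import socketserver
import threading

# Assumed addresses: the real memcached instance and the port this relay listens on.
UPSTREAM = ("127.0.0.1", 11211)
LISTEN = ("0.0.0.0", 11311)


def has_flush_all(chunk: bytes) -> bool:
    """Return True if any text-protocol line in chunk starts with the flush_all command."""
    return any(line.split()[:1] == [b"flush_all"]
               for line in chunk.lower().split(b"\r\n"))


def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src closes or either side errors out."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass


class LoggingRelay(socketserver.BaseRequestHandler):
    def handle(self) -> None:
        with socket.create_connection(UPSTREAM) as upstream:
            # Relay memcached's replies back to the client in the background.
            threading.Thread(target=pump, args=(upstream, self.request),
                             daemon=True).start()
            try:
                while True:
                    data = self.request.recv(4096)
                    if not data:
                        break
                    if has_flush_all(data):
                        host, port = self.client_address[:2]
                        print(f"flush_all from {host}:{port}", flush=True)
                    upstream.sendall(data)
            except OSError:
                pass


if __name__ == "__main__":
    socketserver.ThreadingTCPServer.allow_reuse_address = True
    with socketserver.ThreadingTCPServer(LISTEN, LoggingRelay) as server:
        server.serve_forever()
```

Point each application at the relay's port instead of memcached directly, then watch the relay's output for the client address that sends the command.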

How to make uchiwa dashboard url be able to adjust threshold?

me again..
I have finished the whole sensu-uchiwa-graphite setup, and now I have a new request :(. Rather than changing the threshold in the check.json file on the Sensu server, is there any Uchiwa plugin that lets this adjustment be made from the Uchiwa dashboard? I ask because my application teams want to change thresholds themselves without accessing the server.
I think sensu-admin is available in Enterprise, but we would need to pay big money per year ;(...
Thanks in advance for the help.
Sumana W.
This is fairly doable if you use a configuration management system like Chef/Ansible/Puppet - especially if you run standalone checks on the sensu-client.
This allows the clients to define their own thresholds, rather than changing the sensu servers themselves.
See https://sensuapp.org/docs/latest/reference/checks.html#standalone-checks
In this case, the definitions for the checks are sitting on the client servers and they have the choice of their thresholds or configurations. The client itself manages how often to run the check and sends the output back to the server, rather than the server requesting the checks. This helps quite a bit as far as scaling or multitenancy.
The other way to accomplish this, if you are tied to serverside checks, would be to use client attributes (https://sensuapp.org/docs/0.25/reference/checks.html#check-token-substitution)
For example, you can have a cpu check that says something like check-cpu.sh -w :::cpu_warn::: -c :::cpu_critical::: and these come from a cpu_warn and cpu_critical value from the client.json on the client server.
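To make the substitution concrete, here is a rough Python imitation of what happens to the command string. It is not Sensu's code, only a sketch of the documented :::token::: and :::token|default::: syntax; the client attributes and threshold values are invented for the example, and raising an error on an unmatched token is a choice made for this sketch.

```python
import re

# Matches :::name::: and :::name|default::: (dots in name walk nested attributes).
TOKEN = re.compile(r":::([^:|]+?)(?:\|([^:]*?))?:::")


def substitute_tokens(command: str, client: dict) -> str:
    """Fill command tokens from the client definition, falling back to defaults."""
    def lookup(match: "re.Match") -> str:
        key, default = match.group(1), match.group(2)
        value = client
        for part in key.split("."):
            if isinstance(value, dict) and part in value:
                value = value[part]
            else:
                value = None
                break
        if value is None:
            if default is None:
                raise KeyError(f"unmatched token: {key}")
            return default
        return str(value)

    return TOKEN.sub(lookup, command)


if __name__ == "__main__":
    # Hypothetical custom attributes, as they might appear in client.json.
    client = {"name": "web01", "cpu_warn": 80, "cpu_critical": 95}
    command = "check-cpu.sh -w :::cpu_warn::: -c :::cpu_critical:::"
    print(substitute_tokens(command, client))
```

Each application team then only edits the attributes in its own client.json, and the check definition on the server stays untouched.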
Source: We use sensu extensively in an enterprise environment across thousands of hosts and have been working through these same issues.

What are the limitations of the flask built-in web server

I'm a newbie in web server administration. I've read multiple times that the Flask built-in web server is not designed for "production" and must be used only for tests and debugging...
But what if my app touches only a thousand users who occasionally send data to the server?
If it works, when will I have to bother with configuring a more sophisticated web server? (I am looking for approximate metrics.)
In a nutshell, I would love to find out what the built-in web server can do (with approximate thresholds) and what it cannot.
Thanks a lot!
There isn't one right answer to this question, but here are some things to keep in mind:
With the right amount of horizontal scaling, it is quite possible you could keep scaling out use of the debug server forever. When exactly you would need to start scaling (or switch to using a "real" web server) would also depend on the environment you are hosting in, the expectations of the users, etc.
The main issue you would probably run into is that the server is single-threaded by default (this was true when this answer was written; since Flask 1.0 the development server runs threaded by default). A single-threaded server handles requests one at a time, serially, so if you are trying to serve more than one request (including favicons and static items like images, CSS and JavaScript files) the requests will take longer. If any given request happens to take a long time (say, 20 seconds), your entire application is unresponsive for those 20 seconds. You could bump the thread count (or have requests handled in other processes), which might alleviate some issues, but it can still be slow under a "high" load. What counts as a "high" load depends on your application and on the maximum response time your users will accept.
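The effect of serial request handling is easy to see with a small experiment. The sketch below does not use Flask at all: it serves a deliberately slow WSGI app with the standard library's wsgiref server, once single-threaded and once with a threading mix-in, and times two simultaneous requests. The 0.3 second delay and all names are made up for the illustration.

```python
import http.client
import threading
import time
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer, make_server

DELAY = 0.3  # seconds of simulated work per request


def slow_app(environ, start_response):
    """Minimal WSGI app that pretends each request takes DELAY seconds."""
    time.sleep(DELAY)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done\n"]


class ThreadedWSGIServer(ThreadingMixIn, WSGIServer):
    daemon_threads = True


class QuietHandler(WSGIRequestHandler):
    def log_message(self, *args):
        pass  # keep the experiment's output clean


def fetch(port: int) -> None:
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    conn.request("GET", "/")
    conn.getresponse().read()
    conn.close()


def time_two_requests(server_class) -> float:
    """Serve slow_app with server_class and time two simultaneous requests."""
    server = make_server("127.0.0.1", 0, slow_app,
                         server_class=server_class, handler_class=QuietHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    clients = [threading.Thread(target=fetch, args=(server.server_port,))
               for _ in range(2)]
    start = time.perf_counter()
    for client in clients:
        client.start()
    for client in clients:
        client.join()
    elapsed = time.perf_counter() - start
    server.shutdown()
    server.server_close()
    return elapsed


if __name__ == "__main__":
    print(f"single-threaded: {time_two_requests(WSGIServer):.2f}s")
    print(f"threaded:        {time_two_requests(ThreadedWSGIServer):.2f}s")
```

The single-threaded run should take roughly twice the delay, because the second request waits for the first to finish; the threaded run should take roughly one delay.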
Another issue is security: if you are concerned at ALL about security (and not just the security of the data in the application itself, but the security of the box that will be running it as well) then you should not use the development server. It is not ready to withstand any sort of attack.
Finally, the development server could just fail outright. It is not designed to be used as a long-running process (days, weeks, months), and so it has not been well tested to work in this capacity.
So, yes, it has limitations. Yes, you could still conceivably use it in production. And yes, I would still recommend using a "real" web server. If you don't like the idea of needing to install something like Apache or Nginx, you can still go with a solution that is still as easy as "run a python script" by using some of the WSGI Standalone servers, which can run a server that is designed to be in production with something just as simple as running python run_app.py in the command line. You typically just need to create a 4-5 line python script to import and create the server object, point it to your Flask app, and run it.
gunicorn could be run with only the following on the command line, no extra script needed:
gunicorn myproject:app
...where "myproject" is the Python package that contains the app Flask object. Keep in mind that one of the developers of gunicorn would probably recommend against this approach. See https://serverfault.com/questions/331256/why-do-i-need-nginx-and-something-like-gunicorn.
The OP has long since moved on, but for those who encounter this question in the future I would just add that setting up an Apache server, even on a laptop, is free and pretty easy. It can be readily configured for as few or as many features as you want just by uncommenting or commenting out lines in the config file. There might be an even easier GUI method for doing that nowadays, but just editing the configs is simple.

How Do I Optimize Zend Framework

I have an application built on Zend Framework that I am trying to optimize.
I did some Xdebug profiling, and although I can't say I understand every nitty-gritty detail of the results I got, some things were quite obvious.
For instance, the file Bootstrap.php seems to be the one gulping most of the time, taking 4,553 ms, which accounts for 92.49% of the total time.
If I dig further, I can see that Zend_Application_Bootstrap_Bootstrap->run takes the bulk of the time. Checking this out again, I found that Zend_Controller_Front->Dispatch might actually be the function inside Bootstrap.php that takes time to execute.
The question is, from these indices that I have, how best can I go about optimizing the application? If the answer is caching, how do I go about applying caching to this situation?
Thanks
From the look of the callgrinds, on the login page the app is spending most of its time in curl_exec, which is to be expected if you're doing a remote login. But it is doing 10 separate curl_execs, which seems excessive. I'm not familiar with the LinkedIn login auth, but is it possible your app is running the remote login code multiple times?
On the standard page request the app is spending most of its time connecting to MySQL, and it seems to be doing this twice. Are you using a remote DB server, and do you need two separate DB connections?
Assuming you are using a remote DB server and it is on the same network as your web server, there seems to be some networking issue there. I'd check the latency to that server if you can, and try connecting to the IP address instead of a hostname to see if that makes any difference (if doing this is much faster this would suggest an issue with the DNS setup on your web server).
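A quick way to separate DNS delay from network delay is to time the name lookup and the TCP connect independently. The following is a minimal sketch, assuming a hypothetical database host name and the default MySQL port 3306; substitute your own values.

```python
import socket
import time


def time_dns(hostname: str, port: int = 3306) -> float:
    """Seconds spent resolving hostname to an address."""
    start = time.perf_counter()
    socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    return time.perf_counter() - start


def time_connect(address: str, port: int = 3306, timeout: float = 5.0) -> float:
    """Seconds spent opening (and immediately closing) a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((address, port), timeout=timeout):
        pass
    return time.perf_counter() - start


if __name__ == "__main__":
    db_host = "db.example.internal"  # hypothetical name, replace with your DB server
    print(f"DNS lookup:          {time_dns(db_host) * 1000:.1f} ms")
    ip = socket.gethostbyname(db_host)
    print(f"connect by IP:       {time_connect(ip) * 1000:.1f} ms")
    print(f"connect by hostname: {time_connect(db_host) * 1000:.1f} ms")
```

If the lookup or the connect by hostname is much slower than the connect by IP, the resolver configuration on the web server is the likely culprit; if both connects are slow, look at the network path to the database server instead.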

Perl application move causing my head to explode...please help

I'm attempting to move a web app we have (written in Perl) from an IIS6 server to an IIS7.5 server.
Everything seems to be parsing correctly, I'm just having some issues getting the app to actually work.
The app is basically a couple forms. You fill the first one out, click submit, it presents you with another form based on what checkboxes you selected (using includes and such).
I can get past the first form once... but then after that it stops working and pops up the generated error message. After looking into the code and such, it basically states that there aren't any checkboxes selected.
I know the app writes data into .dat files... (at what point, I'm not sure yet), but I don't see those being created. I've looked at file/directory permissions and seemingly I have MORE permissions on the new server than I did on the last. The user/group for the files/dirs are different though...
Would that have anything to do with it? Why would it pass me on to the next form, displaying the correct "modules" I checked the first time and then not any other time after that? (it seems to reset itself after a while)
I know this is complicated so if you have any questions for me, please ask and I'll answer to the best of my ability :).
Btw, total idiot when it comes to Perl.
EDIT AGAIN
I've removed the source so as not to reveal any security vulnerabilities... Thanks for pointing that out.
I'm not sure what else to do to show exactly what's going on with this though :(.
I'd recommend verifying, step by step, that what you think is happening is really happening. Start by watching the HTTP request from your browser to the web server - are the arguments your second perl script expects actually being passed to the server? If not, you'll need to fix the first script.
(start edit)
There are lots of tools to watch the network traffic.
Wireshark will read the traffic as it passes over the network (you can run it on the sending or receiving system, or any system on the collision domain).
You can use a proxy server, like WebScarab (free), Burp, Paros, etc. You'll have to configure your browser to send traffic to the proxy server, which will then forward the requests to the server. These particular proxies are intended to aid testing, in that you'll be able to mess with the requests as they go by (and much more).
As Sinan indicates, you can use browser add-ons like Fx LiveHttpHeaders, or Tamper Data, or Internet Explorer's developer kit (IIRC).
(end edit)
Next, you should print out all CGI arguments that the second perl script receives. That way, you'll know what the script really thinks it gets.
Then, you can enable verbose logging in IIS, so that it logs the full HTTP request.
This will get you closer to the source of the problem - you'll know if it's (a) the first script not creating correct HTML, resulting in an incomplete HTTP request from the browser, (b) the IIS server not receiving the CGI arguments for some odd reason, or (c) the arguments aren't getting from the IIS server and into the perl script (or, possibly, that the perl script is not correctly accessing the arguments).
Good luck!
What you need to do is clear.
There is a lot of weird excess baggage in the script. There seemed to be no subroutines. Just one long series of commands with global variables.
It is time to start refactoring.
Get one thing running at a time.
I saw HTML::Template there but you still had raw HTML mixed in with code. Separate code from presentation.

How to limit the effect of client modifications to production systems

Our shop has developed a few WEB/SMS/DB solutions for a dozen client installations. The applications have some real-time performance requirements, and are just good enough to function properly. The problem is that the clients (owners of the production servers) are using the same server/database for customizations that are causing problems with the performance of the applications that we created and deployed.
A few examples of clients' customizations:
* Adding large tables with many text datatypes for the columns that get cast to other data types in the queries
* No primary keys, indexes, or FK constraints
* Use of external scripts that use count(*) from table where id = x, in a loop from the script, to determine how to construct more queries later in the same script (no bulk actions that the planner can optimize, no doing everything in a single pass)
* All new code files on the server are created/owned by root, with 0777 permissions
The clients don't take suggestions/criticism well. If we just go ahead and try to port/change the scripts ourselves, the old code can come back, clobbering any changes that we make! Or, with our limited knowledge of their use cases, we break functionality while trying to optimize their changes.
My question is this: how can we limit the resources available to queries/applications other than what we create and deploy? Are there any pragmatic options in scenarios like this? We prided ourselves on having an OSS solution, but it seems that it's become a liability.
We use PG 8.3 running on a range of Linux distros. The clients prefer php, but shell scripts, perl, python, and plpgsql are all used on the system in one form or another.
This problem started about two minutes after the first client was given full access to the first computer, and it hasn't gone away since. Anytime someone's priority is getting business-oriented work done quickly, they will be sloppy about it and screw things up for everyone. That's just how things work, because proper design and implementation are harder than cheap hacks. You're not going to solve this problem; all you can do is figure out how to make it easier for the client to work with you than against you. If you do it right, it will look like excellent service rather than nagging.
First off, the database side. There's no way to control query resources in PostgreSQL. The main difficulty is that tools like "nice" control CPU usage, but if the database doesn't fit in RAM it may very well be I/O usage that is killing you. See this developer message summarizing the issues here.
Now, if in fact it's CPU the clients are burning through, you can use two techniques to improve that situation:
* Install a C function that changes the process priority (example 1, example 2) and make sure whenever they run something it gets called first (maybe put it into their psql config file, there are other ways).
* Write a script that looks for postmaster processes spawned by their userid and renices them; make it run often in cron or as a daemon.
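The second technique could look roughly like the sketch below. It assumes that backends show the database user in their process title ("postgres: user database host activity", which requires process title updates to be enabled), that the client connects as a hypothetical database role named client_user, and that the script runs as root or as the postgres OS user so it is allowed to change priorities. As noted above, this only helps when CPU is the bottleneck, not I/O.

```python
import os
import subprocess
from typing import List

DB_USER = "client_user"  # hypothetical database role used by the client's scripts
NICENESS = 10            # arbitrary example value; higher means lower priority


def backend_pids(ps_output: str, db_user: str) -> List[int]:
    """Pick PIDs of PostgreSQL backends serving db_user from 'pid args' lines."""
    pids = []
    for line in ps_output.splitlines():
        pid, _, title = line.strip().partition(" ")
        if title.strip().startswith(f"postgres: {db_user} "):
            pids.append(int(pid))
    return pids


def renice(pids: List[int], niceness: int = NICENESS) -> None:
    """Lower the scheduling priority of each process in pids."""
    for pid in pids:
        try:
            os.setpriority(os.PRIO_PROCESS, pid, niceness)
        except ProcessLookupError:
            pass  # the backend exited between ps and renice


if __name__ == "__main__":
    # "pid=" and "args=" suppress the header row in ps output.
    listing = subprocess.run(["ps", "-e", "-o", "pid=", "-o", "args="],
                             capture_output=True, text=True, check=True).stdout
    renice(backend_pids(listing, DB_USER))
```

Run it from cron every minute or so; backends that start and finish between runs will not be caught, which is the main weakness of this approach compared with calling a priority function at connect time.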
It sounds like your problem isn't the particular query processes they're running, but rather other modifications they're making to the larger structure. There's only one way to cope with that: you have to treat the client like they're an intruder and use the approaches of that portion of the computer security field to detect when they screw things up. Seriously! Install an intrusion detection system like Tripwire on the server (there are better tools, that's just the classic example), and have it alert you when they touch anything. New file that's 0777? Should jump right out of a proper IDS report.
On the database side, you can't directly detect the database being modified in a useful way. You should dump the schema every day into a file (pg_dumpall -g and pg_dump -s), then diff that against the last one you delivered, and again have it alert you when it has changed. If you manage this well, the contact with the client turns into "we noticed you changed on the server...what is it you're trying to accomplish with that?" which makes you look like you're really paying attention to them. That can turn into a sales opportunity, and they may stop fiddling with things as much just knowing you're going to catch it immediately.
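The daily dump-and-diff can be scripted in a few lines. This is a sketch under assumptions: the database name and baseline file path are placeholders, and the script relies on cron mailing any output to you when something changed.

```python
import difflib
import subprocess
import sys


def dump_schema(dbname: str) -> str:
    """Return the global objects plus the schema-only dump of dbname as text."""
    globals_sql = subprocess.run(["pg_dumpall", "-g"], capture_output=True,
                                 text=True, check=True).stdout
    schema_sql = subprocess.run(["pg_dump", "-s", dbname], capture_output=True,
                                text=True, check=True).stdout
    return globals_sql + schema_sql


def schema_changes(baseline: str, current: str) -> str:
    """Unified diff from baseline to current; an empty string means no change."""
    return "".join(difflib.unified_diff(
        baseline.splitlines(keepends=True),
        current.splitlines(keepends=True),
        fromfile="delivered", tofile="today"))


if __name__ == "__main__":
    # Placeholder names: the schema you last delivered, and the client's database.
    with open("delivered_schema.sql") as handle:
        delivered = handle.read()
    diff = schema_changes(delivered, dump_schema("appdb"))
    if diff:
        print(diff)   # cron mails this output as the alert
        sys.exit(1)
```

Keeping the delivered baseline under version control gives you the history of what changed and when, in addition to the alert.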
The other thing you should start doing immediately is install as much version control software as you can on each client box. You should be able to log in to each system, run the appropriate status/diff tool for the install, and see what's changed. Get that mailed to you regularly too. Again, this works best if combined with something that dumps the schema as a component of what it manages. Not enough people use serious version control approaches on the code that lives in the database.
That's the main set of technical approaches useful here. The rest of what you've got is a classic consulting client management problem that's far more of a people problem than a computer one. Cheer up, it could be worse--FSM help you if you give them ODBC access and they discover they can write their own queries in Access or something simple like that.