Related
I don't really know how to explain this, so bare with me. But our Facebook pixel detected traffic from another domain. We only have one domain. We went to see what other domain it could possibly be referencing. It turns out, this other domain was a carbon copy of our site. The only thing that was different was the web address. Does anyone have a clue what is going on? It's as though someone is retargeting our customers to a mirrored website.
We tested the foreign site by placing an order using store credit given to ourselves on the backend of our site. The order went through and instead of showing the order was placed in the US, it said it was placed in Turkey.
This is over my head and I have no clue where to start solving this issue.
I've actually seen this happen to someone else before. I'm not sure what the motive behind doing something like this is - but if the orders from the cloned store are being paid to your gateway, then the upside is that you're not losing money over it. However, I do believe that the intent is somewhat malicious.
The most logical reason I have been able to come up with is that if your store has high amounts of traffic, is well known, and has a good SEO rating, the people that are cloning your store are trying to "SEO-Hijack" you in a sense. Essentially piggybacking off of your site because of the SEO ratings it already has in order to boost their own and potentially turn it into a separate store/website later.
This isn't necessarily something that can be fixed by BigCommerce since the copy of your store isn't on the platform whatsoever, since they are essentially just piggybacking off of your SEO rating. The best option here would be to do a domain WHOIS lookup for their domain and report it as fraud to their registrar as an attempt to get legal action to be taken or a cease & desist.
Sorry that this is happening to you!
Here's a helpful explanation that I was able to find and a helpful blog post on how to prevent it and the steps to take.
Oh no, I'm sorry to hear about this! As blurfus suggested above -- Please the BigCommerce Support team to report this as soon as you can. You can find their contact information here: https://support.bigcommerce.com/s/#contact
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I have a serious question. Is it ever ethical to ignore the presence of a robots.txt file on a website? These are some of the considerations I've got in mind:
If someone puts a web site up they're expecting some visits. Granted, web crawlers are using bandwidth without clicking on ads that may support the site but the site owner is putting their site on the web, right, so how reasonable is it for them to expect that they'll never get visited by a bot?
Some sites apparently use a robots.txt exactly in order to keep their site from being crawled by Google or some other utility that might grab prices and therefore allow people to do price comparisons easily. They have private search engines on the site so they obviously want people to be able to search the site; apparently they just don't want people to be able to easily compare their information with other vendors.
As I said, I'm not trying to be argumentative; I would just like to know if anyone has ever come up with a case where it's ethically permissible to ignore the presence of a robots.txt file? I cannot think of a case where it's permissible to ignore the robots.txt mainly because people (or businesses) are paying money to put up their web sites so they should be able to tell the Googles/Yahoos/Other SE's of the world that they don't want to be on their indices.
To put this discussion in context, I'd like to create a price comparison website and one of the major vendors has a robots.txt that basically prevents anyone from grabbing their prices. I'd like to be able to get their information but, as I said, I can't justify simply ignoring the wishes of the site owner.
I have seen some very sharp discussion here and that's why I would like to hear the opinions of developers that follow Stack Overflow.
By the way, there is some discussion of this topic on a Hacker News question but they seem to mainly focus on the legal aspects of this.
Arguments:
A robots.txt file is an implied license, especially since you are aware of it. Thus, continuing to scrape their site could be seen as unauthorized access (i.e., hacking). Sucks, but arguments like this have been made in other legal cases recently (not directly related to robots.txt, but in relation to other "passive controls".)
Grabbing prices violates no copyright law, including DMCA, since copyright does not include factual information, only creative.
Ethically, you should not grab prices because the vendor should have the ability to change prices without worrying about being accused of a bait/switch by people coming from your site.
Have you taken the high road, explaining the site to them and saying you'd love to include them in your list of vendors? Maybe they will love the idea and actually expose the data in a way that is easy for you to consume and less resource-intensive for them to produce.
There are no laws written directly about robots.txt because netiquette is generally followed. Don't be one of the "bad guys."
Some people filter robots because they use URL links to perform "actions" like adding things to carts, and robots leave them with massive numbers of abandoned shopping carts in their database.
Some people filter robots because they have exclusive prices that they can't advertise openly based on agreements with their vendors. You could be putting them in a bad position by exposing those prices on your site.
In this economy, if a company doesn't want to do everything possible to advertise themselves, it's their own fault that you don't include them.
The other use of robots.txt is to help protect web spiders from themselves. It's relatively easy for a web spider to get mired in an infinitely deep forest of links, and a properly constructed robots.txt file will tell the spider that "you don't need to go here".
Many people have tried to build businesses off building "price comparison" engines that scraped major sites.
Once you start getting any sort of traffic/revenue to speak of, you will receive a cease and desist. It's happened to dozens, if not hundreds of projects. I even worked on a small project that received a C&D from Craigslist.
You know how they say "It's easier to ask forgiveness than it is to get permission"? It doesn't hold true with page scraping. Get permission, or you will be hearing from their lawyers.
If you're lucky, it'll be early on, when you've got nothing to lose. If it's late, you may lose your business and all your work overnight, with a single letter.
Getting permission shouldn't be hard. Unless you're doing something sneaky, you're likely going to drive them additional traffic. Hell, once your product takes off, sites may be begging you, or even paying you to add their data.
One reason we allow robots to dig through the web without complaint is that we have a way to stop them if we want to. Protects both sides.
Remember the uproar when Cuil's robots were accused of going over-the-top, apparently acting like a DoS attack in some cases and using up bandwidth allowances of some small sites?
If too many people violate robots.txt we might get something worse.
"No" means "no".
To answer the narrow question, for the price comparison website you're probably best grabbing the price in real time, rather then scrapping the database in advance. Hard to imagine that being a problem.
An interesting IRL version of story involving The Harvard Coop:
Coop Calls Cops On ISBN Copiers.
Short answer: No.
On the narrow issue: If a seller says that their prices are secret, I think you have to respect that. I'd contact them and ask if they really don't want price comparison engines like yours to include them, or if the "no trespassing" sign is for technical reasons. If the latter, perhaps they'll provide you with an alternative. If the former, then I'd say too bad, they don't get included, they lose some business, and it's their problem.
Tangential rant: Personally, I get pretty annoyed with companies that make me jump through hoops to find out the price of their products, places that make me call and talk to a salesman so he can give me a hard-sell pitch, or worse, make me give them my phone number so their salesman can call and harass me. I figure that if they're afraid to tell me the price, it probably means that it's too high.
In general: A robots.txt file is like a "No Trespassing" sign. It's the owner's right to say who is allowed on their property. If you think their reasons are dumb, you can politely suggest they take the sign down. But you don't have the right to disregard their wishes. If someone puts a No Trespassing sign on his yard, and I say, "Hey, I just want to take a quick short cut, what's the big deal?" -- Maybe I'm stepping on his prized Bulgarian violet bulbs and destroying a valuable investment. Maybe I'm crossing his people's sacred burial ground and offending their religious sensibilities. Or maybe he's just an ornery jerk. But it's still his property and his right. Oh, and if I fall into the dangerous sinkhole after ignoring the No Trespassing sign, who's to blame? (In America, I could probably still sue him for all he's worth despite the fact that he warned me, but is that right?)
I'm showing some ignorance here, but I always thought a bot was something only sent out by a search engine. Like Google or Yahoo.
Thus, if you wrote an application that searched content on the internet, I wouldn't consider that a search engine bot, which to my knowledge is what robots.txt is trying to block.
But this may just be selective ignorance, because I might do it until the webmaster of that site contacted me and asked me to stop :)
If people make it available to public access, they shouldn't try to put limits on it. Adding a robots.txt file to your site is the equivalent to putting a sign on your lawn that says "Please don't look at me."
I've finally found a client for my hosted software - the first time I've ever sold software. I want both parties to sign a contract specifying things like expected uptime, payment schedules, etc., so that no one feels like they've been cheated, but I'm not a lawyer and can't really afford one right now. Does anyone know how to start with this process?
TIA.
When writing a legal agreement of that type, it is hard to give any other answer than "get a lawyer."
Repeat exactly what you have just said to us to your customer. Start a conversation, and take it from there.
In my experience, they will likely appreciate your openness and desire to make sure all parties interests are protected.
If they are large, they may have contracts/SLAs that you can use as a template, and if they are small they may agree to settle for some very simple terms. There are also many templates on the internet available for a small charge.
But as Robert says, you really shouldn't sign anything legally binding without some input from a qualified solicitor. Further, it can't be your run of the mill high street lawyer either. You really need somebody that deals with software/SLA's on a regular basis. Sadly in my experience these people are never cheap.
Allen
I dont know what to say. About 3 days ago I released a script to the public. Today I realised, after searching on google that someone had already nulled (removed my protection) and pirated the script.
How do I stop users from pirating the script? It is written in PHP.
Please help or suggest some solutions.
Thank you for your time.
UPDATE By releasing to the public means that I have started selling it to users.
UPDATE My program is priced at only $49. Very reasonable for the functionality it offers. I do not understand how I should stop pirates from pirating my code. The replies which most people have given are rather sarcastic. I was hoping for some good advice. I know there is no silver-bullet. But some techniques which you have used in your PHP programs.
The only real way to prevent piracy is to not give the user the program at all! What I mean by this is have the logic you want to protect remain server side and offer a client interface.
There are a few companies that offer protection services, but these are expensive and can sometimes still be overcome.
If you're worried about this happening again, try obfuscating your code. Here is a free program to do just that on PHP code.
I'm not trying to be sarcastic here: forget about them. Here's my rationale:
You can spend tons of time trying to
prevent pirates from pirating your
stuff, or you can spend the same
amount of time giving your paying
users more functionality.
Extreme copy protection does not give your paying users anything but more
hoops to jump through to use your
application - which might lead them
to get frustrated.
Pirates will pirate your applications
no matter how much time you spend
trying to stop them.
Budget a certain
amount of time to put in basic copy
protection - just enough to keep the
honest people honest.
Most importantly: Don't irritate your paying customers.
They are the ones you need to make
happy.
There's not much you can do.
Be flattered your work was deemed worth the effort!
How do I stop users from pirating the
script?
Do not release sensible source code to the public...
[EDIT] After a few downvotes, I decided to comment on my answer:
Any code that is released public has a chance of being hacked. This is the number one reason why Javascript is not secure. No matter how much you will obfuscate it, compress it or translate it to some random japanese dialect, it is still source code that the user has access to. Hence it should not contain any sensible information such as passwords or such. All sensible data should be stored in the server side where it is kept hidden from the user.
If you are releasing a php framework containing both the server and client code; then you have no way of fully protecting yourself. PHP is, like Javascript, an interpreted language. You may translate it, compress it, or obfuscate it as much as you want, (and it's probably the best thing you can do) you will never fully protect it when released to the public.
Again... If there was a magic way to prevent code from being broken, it would have been known for a long time. No-cd patches / cracks for new games/softwares now are almost released the same day as the softwares themselves. It is, as noted by Paul, a form of flattery for you, even though I understand how sorry you may feel.
There are a few instances where programmers ended up with bullet-proof protection, but it usually involved high-end engineering.
With PHP, you're mostly out of luck. It's an interpreted language, which means that you are essentially forced to give away the source code. Sure, there are obfuscators (tools that "scramble" the source code to make it near impossible to read for humans), but they can be circumvented as well.
There are product like Zend Guard which seem to offer a better level of protection, but from my understanding, your customers need Zend Guard installed as well, which is almost never the case.
There are several methods of handling this:
Offer your product as a service. This means finding appropriate hosting in the cloud, etc. This removes access to your code base, thus preventing direct piracy. Someone can still reverse engineer your stuff, but I'll touch on that later.
Add a unique identifier to each version of the script sold. This can be done automatically, and is great to do with obfuscated code (another, complementing method). This will give you the ability to track whoever pirated your code. If you can track them, you can sue them (or worse).
Pursue legal action. You'll need to know who leaked the code in the first place for this. Their PayPal information or even an IP address should be enough. You go to your lawyer, ask him to get a court order telling PayPal/ISP to release the identity of the thief, and then start tracking them down. If they're located overseas, your only real option is to freeze/appropriate funds from PayPal/credit card. Banks will be sympathetic only if they have a branch in your country (which can be targeted for legal action).
Ignore it, and simply build your business model around the support that you offer.
The sad fact is that information cannot be secured completely. There is no way to prevent a team of Indian programmers from reverse engineering your program. So you just have to be better than them, and constantly improve your product (this is "A Good Thing (TM)", so do it anyways)
Also keep in mind that DRM and other solutions are often controversial, and will reduce your sales (especially among early-adopters). On a personal level, I would suggest viewing this as a compliment. After all, your script was useful enough that someone bothered to pirate it within a week!
PHP is easily decoded, so for people who really want to know, it's easy to find out the source code. However, there are certain obfuscator programs such as this one that'll make your PHP script almost unreadable for those trying to decode it.
What kind of protection did you think you had added to a PHP script, anyway? You should add a line of the form:
if ($pirated)
exit();
and then make it mandatory (in the licence agreement) that users set the $pirated variable accordingly.
Forget trying to prevent it
Go the way of CakePHP (see sidebar on front page) and many other open source projects and ask for donations.
People actually do it!
Contact the pirate and let h{im,er} know that you will be forced to take legal action against them if they do not abide by the license.
I agree with #Michael.
Try ionCube or Zend Guard. They are both commercial offerings, but you say that you are selling your software so it might be worth it. Although nothing is foolproof and can be reverse engineered with enough effort and technical skill, these solutions are probably good enough for the average PHP script vendor.
I agree with Samoz's suggestion to keep the logic server side, however this can often be hard to do. The best strategy is to make the user want to buy it by offering updates automatically to registered users, as well as installation, advice and good support. You are never going to sway people hell bent on pirating, however your goal should be to persuade those who are undecided as to whether to pirate or purchase the script.
Any obfuscation/decryption technique for PHP can be cracked
Jumping in very late to this conversation, but saw this question featured. Nobody mentioned contacting a lawyer and pursuing litigation. You likely saw the script on a server - hosted by a known hosting company - you can probably get a DMCA takedown to have the script removed. If you really press the case, you may be able to sue for damages.
Found this link to assist in going this route:
http://www.keytlaw.com/Copyrights/cheese.htm
You could always pirate it yourself to the internet and hope that any nuller will think "its already been grabbed" so don't bother. But pirate a real buggy version. When users come to you looking for help you'll know they have a pirate version if they question you about specific bugs you purposely added and you can approach them accordingly
If your script won't consume a lot of bandwidth, you could keep your "logic" server-side, as samoz suggested, but if your users won't use it responsively ( a crawler, for example ), this could be trouble.
On the other side, you could become a ninja ...
Attach a copyright notice to it. Some companies will actually care that they're using software properly.
Actually I think it's easier to protect PHP scripts than desktop software, because with latter you never know who is running the cracked copy.
In case of PHP on the other hand, if people run your software on public web servers, you can easily find them and take them down. Just get a lawyer and turn them in to the police. They could also be breaking DMCA laws if they remove your protection so that gives you even more ammunition.
Technical way to protect your code is obfuscation. It basically makes your code unreadable like binaries in compiled languages (like Java). Of course reverse engineering is possible, but needs more work.
In general it's hard to prevent users from stealing code when the program is written in a scripting language and distributed in plain text. I've found that http://feedafever.com/ did a really nice job of being able to sell PHP code but still give the code to users.
But the solution to your problem is very dependent on the domain of your program. Does this script run on the users machine with no internet connection? Or could this be a hosted service?
I'd also suggest looking at some of your favorite software, and seeing how they convinced you to pay for it initially. The issue I find isn't always "how can I prevent my users from stealing my software" but sometimes more "how do I convince my users that it's in their best interests to pay me". Software piracy often comes when your product is overpriced (Ask your friends what they would pay for a software package like the one you are selling, I've found that I have historically overpriced my software by 20%).
Anyway, I hope this helps. I'm glad that you are trying to create software that is useful to users and also not incredibly crippled. I personally of the mind that all software that isn't shrink wrapped or SAAS should be free, but I totally understand that we all need to eat.
The trick is not to try to prevent the piracy (in the long term, this is a losing battle), but to make the legitimate version of your product more accessible and/or more functional than the pirate versions.
"Making it more functional" generally means providing involves additional features or services to registered users, which cannot be replicated for free by the pirates. This may be printed materials (a users manual, a gift voucher, etc), services such as telephone support or help setting the product up, or online extras within the software.
I'll point out that companies such as RedHat are able to make significant amounts of money selling open source software. The software itself is freely available -- you can download it and use it for free without paying RedHat a penny. But people still pay them for it. Why? Because of the extra services they offer.
"Making it more accessible" means making it easier to get your legitimate software than a pirate copy. If someone visits Google looking for your software and the first result is a pirate download site, they'll take the pirate copy. If the first result is your home page, they're more likely to buy it. This is especially important for low-cost software: pirated software may be 'free', but usually it takes more effort to get. If that effort is outweighed by the low cost and lack of effort of simply buying it legitimately, then you've won the battle.
I saw anti-piracy working once only. Quantel EditBox systems (a post-processing video solution), Hardware+Software+Internet solution against Piracy. Workstation only works after checking if the bank received the monthly rent. If not, workstation was locked. Funny days when this happens... (Funny days for me, no work at all... No funny day for the hacker.)
Well, PHP is far away from hardware solutions... so I guess your only real choice is a server side protected against a tiny unsafe client pushing content, as pointed in some answer yet.
piracy != copyright infringement
There are known routes to litigate copyright infringers.
Does it really matter enough to hire a legal team?
Obfuscation do add something. It will not be fun to try to modify your code at least even if they can take the first version of it. In best case they will try to find some open source project that does something similar. Guess this would give you an fast fix at least for your problem?
As I'm starting to develop for the web, I'm noticing that having a document between the client and myself that clearly lays out what they want would be very helpful for both parties. After reading some of Joel's advice, doing anything without a spec is a headache, unless of course your billing hourly ;)
In those that have had experience,
what is a good way to extract all
the information possible from the
client about what they want their
website to do and how it looks? Good
ways to avoid feature creep?
What web specific requirements
should I be aware of? (graphic
design perhaps)
What do you use to write your specs in?
Any thing else one should know?
Thanks!
Ps: to "StackOverflow Purists" , if my question sucks, i'm open to feed back on how to improve it rather than votes down and "your question sucks" comments
Depends on the goal of the web-site. If it is a site to market a new product being released by the client, it is easier to narrow down the spec, if it's a general site, then it's a lot of back and forth.
Outline the following:
What is the goal of the site / re-design.
What is the expected raise in customer base?
What is the customer retainment goal?
What is the target demographic?
Outline from the start all the interactive elements - flash / movies / games.
Outline the IA, sit down with the client and outline all the sections they want. Think up of how to organize it and bring it back to them.
Get all changes in writing.
Do all spec preparation before starting development to avoid last minute changes.
Some general pointers
Be polite, but don't be too easy-going. If the client is asking for something impossible, let them know that in a polite way. Don't say YOU can't do it, say it is not possible to accomplish that in the allotted time and budget.
Avoid making comparisons between your ideas and big name company websites. Don't say your search function will be like Google, because you set a certain kind of standard for your program that the user is used to.
Follow standards in whatever area of work you are. This will make sure that the code is not only easy to maintain later but also avoid the chances of bugs.
Stress accessibility to yourself and the client, it is a big a thing.
More stuff:
Do not be afraid to voice your opinion. Of course, the client has the money and the decision at hand whether to work with you - so be polite. But don't be a push-over, you have been in the industry and you know how it works, so let them know what will work and what won't.
If the client stumbles on your technical explanations, don't assume they are stupid, they are just in another industry.
Steer the client away from cliches and buzz words. Avoid throwing words like 'ajax' and 'web 2.0' around, unless you have the exact functionality in mind.
Make sure to plan everything before you start work as I have said above. If the site is interactive, you have to make sure everything meshes together. When the site is thought up piece by piece, trust me it is noticeable.
One piece of advice that I've seen in many software design situations (not just web site design) relates to user expectations. Some people manage them well by giving the user something to see, while making sure that the user doesn't believe that the thing they're seeing can actually work.
Paper prototyping can help a lot for this type of situation: http://en.wikipedia.org/wiki/Paper_prototyping
I'm with the paper prototyping, but use iplotz.com for it, which is working out fine so far from us.
It makes you think about how the application should work in more detail, and thus makes it less likely to miss out on certain things you need to build, and it makes it much easier to explain to the client what you are thinking of.
You can also ask the client to use iplotz to explain the demands to you, or cooperate in it.
I also found looking for client questionnaires on google a good idea to help generate some more ideas:
Google: web client questionnaire,
There are dozens of pdfs and other forms to learn from