I want to have 2 versions of Gensim for using summarization and keyword function from old Gensim.
How can I setup this senario?
In general, a single Jupyter notebook is backed by a single Python interpreter/environment, and popular packages at their 'official' installation paths can only be installed once.
There are a few hackish workarounds suggested in answers like:
Installing multiple versions of a package with pip
However, each workaround presents operational problems.
One approach is to install the older package to a non-standard path (directory) that's still found by Python importing logic (controlled by PYTHONPATH). For example, put/move the older copy of Gensim to a gensim_old package directory. But: this is only likely to work well with very sime (single-.py-file) packages.
With any signficant library (like Gensim) which cross-imports a lot of things from its own utility modules, using the standard paths, lots of things are likely to break unless you dig into all involved individual files to change their import paths. That's kind of kludgey & hard-to-maintain. (Though, to the extent you're just using one old version, say gensim-3.8.3 for the removed summarization feature, perhaps it'd be worth fighting through this process once, then keeping the changes around.)
Another approach is to create a totally-separate Python environment with the alternate version, and only use that other environment from the notebook by a system-call – via either something in Python-code like subprocess.call(), or the notebook-cell ! or !! magic-escapes to run a shell command. That is, you give up the ability to run individual interactive lines of Python in that alt environment - but could still send it batches of data, and either capture the console output or observe its output files to continue processing in your notebook.
I'd expect this to be a better option – cleaner & more-maintainable – provided that either the old-version-functionality (summarization) or new-version-functionality (whatever else) can be condensed into one (or a few) single-step scripts.
Another option would be to try to completely copy the gensim.summarization source code files to some new location inside your own project – performing whatever (few, minor) edits are necessary to ensure it works from the alternate location.
One of the reasons that functionality was removed was that its approach to things like tokenization was not consistent/integrated with other Gensim practices – which actually means it's likely to be a little easier to keep it working (given its use of its own idiosyncratic approaches) separately.
Personally I'd rank these three options desirability as:
(best) Section off the summarization tasks to be run via subprocess executions in a separate Python environment, which has only the older package installed.
(maybe ok) Copy the 10 .py files that implement the gensim.summarization' to your own local module. Edit lightly as necessary to ensure they still work. (That should mainly be updating import` lines, but might reuire a few other adaptations to other Python 3.x/Gensim 4.x changes.)
(probably too messy) Install the whole old package to a non-standard directory, edit lots of files to ensure anything you're using still works.
Finally, note that the main reason the feature was removed is that it did not offer very impressive or adaptable results. While I've seen some people say it's worked OK for their applications, I've never seen even so much as a demo where its practices/algorithm – which can only extract some subset of important sentences, never paraphrase – gave impressive results.
So unless you already know that its approach works well for your needs, don't get your hopes up! Good luck.
My understanding is that splatting variables is the preferred/recommended way to make longer function calls in PowerShell scripts. However, I use vscode as my primary IDE and understandably, extracting the parameters into a hashtable and splatting them makes intellisense unusable.
Is there any library/framework/vscode extension for splatting that allows the use of intellisense by way of naming convention or something like that?
I know this is a relatively old question but whilst looking for similar I came across Editor Services Command Suite which looks like it may be useful. It allows you to write the command out and then convert it to a splatted version:
ESCS github repo
It was this blog post by Rob Sewell (sqldbawithabeard) which brought it to my attention: blog post
I've been learning Swift for a little bit now, and I really want to use it. However, compiling Swift on Windows is quite a chore. I can do it from Visual Studio 2015 easily, but VS2015 support is very poor, and incredibly hard to work with; I would prefer to use Atom and the command line. I use RemObject's Silver to compile to .NET, but I can never get the command line to work.
When I use the Elements.exe at C:\Program Files (x86)\RemObjects Software\Elements\Bin, passing the filename as a parameter, it tells me that print is undefined.
Does anyone know how to use Silver's command line to compiler Swift?
Have you tried referencing the online documentation?
calling elements needs some libraries to import:
elements --mode=ECHOES --reference=Swift myfile.swift
I recommend using MSBuild, rather than Elements.exe, to build Elements projects from the command line. It's more complete and supports more advanced build tasks needed for some project types.
That said, --mode and --reference should definitely work. Can you post a complete command line and output?
You will need --reference=Swift in order for print() & co to be known, and --reference System.Core to use LINQ.
I'm working on writing a Perl script that provides an interface to WinCvs for all of my Perl scripts to use. While trying to implement the diff command I came across an option that looks incredibly useful (if it does what I think it does) but I can't figure out how to use it.
Here is the "usage:" output for the WinCvs diff command.
My question is what is how do you use the -3 option?
The online CVS documentation doesn't even reference this option and I can't figure out a way to make it produce a useful output.
I tried using it like this cvs diff -3 -r<revision1> -r<revision2> -r<revision3> but I just get the following error.
Using this command with only 2 revisions seems to give the same output as if I didn't use -3. Am I missing something? Or is this command just inherently broken in WinCvs?
We're struggling to come up with a command name for our all purpose "developer helper" tool, which we are using on our project. It's like a wrapper for our existing tools like cmake and hg. The purpose of the command is really just to make our lives easier by combining multiple commands into one (for example, publishing packages). For example, we have commands like:
do conf
do build
do install
do publish
We've considered a few ambiguous names like do (as above) and run, but obviously, do is a Linux bash command and run is pretty ambiguous.
We'd like our command to be 2 chars short, preferably - but who thinks we're asking the impossible? Is there a practical way to check the availability of command names (other than just typing them into your terminal), or is it just a case of choose one and hope nobody else will use it? Are we worrying about nothing?
Since it's a "developer helper" tool why not use hm [run|build|port|deploy|test], Help Me ...
Give it a verbose name, then let everyone alias it to whatever they want. Make sure you use the verbose name in other scripts so that it removes ambiguity.
This way, each user gets to use whatever makes sense to him/her, and the scripts are more readable and more easily searchable (for example, grepping four "our_cool_tool" will usually yield better results than grepping for "run").
How many 2-character words are useful in this context? I think you need four. With that in mind, here are some suggestions.
omni
torq
fluf
mega
spif
crnk
splt
argh
quat
drul
scud
prun
sqat
zoom
sizl
I have more if you need them.
Pick one: http://en.wikipedia.org/wiki/List_of_all_two-letter_combinations
To check the availability of command names, I suggest looking for all two-letter filenames that are in the directories in your path. You can use a script like this
for item in `echo $PATH | sed 's/:/ /g'` ; do
ls -1d $item/??
done
It won't show builtins in your shell (like "do" as you mentioned) but it's a good start.
Change ?? to ??? for three-letter files, etc.
I'm going to vote for qp (quick package?) since it's easy to pronounce, easy to type, and easy to remember where the keys are on the keyboard.
I use "asd". it's short and most developers type it without thinking
(oh, and you can always claim later that it stands for some "Advanced Script for Developers" if you need to justify yourself a few years from now)
How about fu? As in Kung Fu. It's a special purpose tool. And it's really easy to type.
I think that run is a good name, at least anybody that will download your project will know what to do. Calling it without parameters should reveal your options.
Even 'do' will do, I think you can use backquotes to run it from bash scripts.
Also remember that running the tools without parameters will tell you what options you have.
Use makefiles to do everything for you.
How about calling it something descriptive, like 'build_runner', and then just aliasing it to 'br' (or preferred acronym) in your .bashrc?
There is a really crappy tool called cleartool (part of clearcase), and people will alias it on their machine to "ct". Perhaps you can have a longer command and suggest users alias it.
It would probably be best to do something like ire_and_curses suggested, name it descriptively then alias it to a 2 letter command. If I was choosing, I would name it dev_help and alias it to dh.
I think you're worrying about nothing. Install the program as 'the-command-to-do-evertyhing-and-if-you-dont-make-your-own-alias-for-it-you-should'. I don't think that will be too long for any modern filesystems, but you might need to shorten it to 'tctdeaiydmyoafiys'. See what common aliases are used, and then change the program's name to that. In other words: don't decide, let natural selection decide for you. If you are working with a team of < 10, this should not even remotely cause any problems.
Call it devtool alias to dt
Custom tools like that I like to start with the prefix 'jj-'. I can type (with big index-finger power) 'jj ' and see all my personal commands. Also, they group together in alphabetical lists. 'J' is not a very common character for built-inc commands, but you can pick your own.
Since you want two characters, you can use just 'zz', or something starting with 'z'.
Are you sure you want to put all your functionality in one command? That might be simultaneously over-constraining and over-loading the interface a little.
do conf
do build
do install
do publish