I'm trying to publish in JOSS (the Journal of Open Source Software), which requires the paper to be written in Markdown and hosted on GitHub. I'm struggling to understand how to add citations. I included a file named paper.bib in the main folder of my GitHub repository. In README.md I wrote:
---
title: 'CREDO: a friendly Customizable, REproducible, DOcker file generator'
tags:
  - Docker
  - Reproducibility
  - Docker generator
  - User Iinterface
authors:
  - name: Simone Alessandri'
    equal-contrib: 1
    affiliation: 1
  - name: Rabellino Sergio
    equal-contrib: 2
    affiliation: 2
  - name: Sandro Contaldo
    equal-contrib: 3
    affiliation: 2
  - name: Maria Ratto
    equal-contrib: 3
    affiliation: 4
  - name: Gabriele Piacenti
    equal-contrib: 3
    affiliation: 5
  - name: Qi Wang
    equal-contrib: 3
    affiliation: 3
  - name: Marco Beccuti
    equal-contrib: 4
    affiliation: 2
  - name: Raffaele Adolfo Calogero
    equal-contrib: 4
    affiliation: 4
  - name: Luca Alessandri
    equal-contrib: 5
    affiliation: "3,4"
  - name: Author with no affiliation
    corresponding: true
    affiliation: 3
affiliations:
  - name: Politechnic of Turin, Torino, Italy
    index: 1
  - name: Department of Computer Science, University of Torino, Torino
    index: 2
  - name: Department of Pathology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
    index: 3
  - name: Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino
    index: 4
  - name: Molecular Biotechnology Center & Department of Life Sciences and Systems Biology, University of Turin, Torino, Italy
    index: 5
date: 11 July 2022
bibliography: paper.bib
aas-doi:
aas-journal: JOSS The Journal of Open Source Software
---
Is this enough to load the citations? Here is my bib file.
@inproceedings{uno,
title={Reproducible bioinformatics project: a community for reproducible bioinformatics analysis pipelines},
author={N. Kulkarni , L. Alessandri, R. Panero, M. Arigoni, M. Olivero, G. Ferrero, et al},
booktitle={BMC Bioinformatic},
pages={vol. 19 Suppl 10:349, 2018, doi:10.1186/s12859-018-2296-x},
doi={10.1186/s12859-018-2296-x}
}
@inproceedings{due,
title={https://docs.docker.com/engine/}
}
@inproceedings{tre,
title={Containers in Bioinformatics: Applications, Practical Considerations, and Best Practices in Molecular Pathology},
author={S. Kadri, A. Sboner, A. Sigaras and S. Roy},
booktitle={J Mol Diagn., 2022},
doi={10.1016/j.jmoldx.2022.01.006}
}
@inproceedings{quattro,
title={https://cran.r-project.org/}
}
@inproceedings{cinque,
title={https://www.python.org/}
}
@inproceedings{sei,
title={Using R and Bioconductor in Clinical Genomics and Transcriptomics},
author={J.L. Sepulveda. },
booktitle={J Mol Diagn vol. 22},
doi={10.1016/j.jmoldx.2019.08.006}
}
@inproceedings{sette,
title={Sparsely-connected autoencoder (SCA) for single cell RNAseq data mining},
author={L. Alessandri, F. Cordero, M. Beccuti, N. Licheri, M. Arigoni, M. Olivero, et al },
booktitle={NPJ Syst Biol Appl. vol. 7},
doi={10.1038/s41540-020-00162-6}
}
@inproceedings{otto,
title={Sparsely Connected Autoencoders: A Multi-Purpose Tool for Single Cell omics Analysis},
author={L. Alessandri, M.L. Ratto, S.G. Contaldo, M. Beccuti, F. Cordero, M. Arigoni, et al},
booktitle={nt J Mol Sci., vol. 22},
doi={10.3390/ijms222312755}
}
@inproceedings{nove,
title={rCASC: reproducible classification analysis of single-cell sequencing data},
author={L. Alessandri, F. Cordero, M. Beccuti, M. Arigoni, M. Olivero, G. Romano, et al},
booktitle={Gigascience, vol. 8},
doi={10.1093/gigascience/giz105}
}
@inproceedings{dieci,
title={https://docs.conda.io/en/latest/}
}
@inproceedings{undici,
title={https://bioconda.github.io/}
}
@inproceedings{dodici,
title={Orchestrating high-throughput genomic analysis with Bioconductor},
author={W. Huber, V.J. Carey, R. Gentleman, S. Anders, M. Carlson, B.S. Carvalho, et al},
booktitle={Nat Methods, vol. 12},
doi={10.1038/nmeth.3252}
}
@inproceedings{tredici,
title={Bioconductor: open software development for computational biology and bioinformatics},
author={R.C. Gentleman, V.J. Carey, D.M. Bates, B. Bolstad, M. Dettling, S. Dudoit, et al},
booktitle={Genome Biol., vol. 5},
doi={10.1186/gb-2004-5-10-r80}
}
@inproceedings{quattordici,
title={https://github.com/}
}
@inproceedings{quindici,
title={https://uwekorn.com/2021/03/01/deploying-conda-environments-in-docker-how-to-do-it-right.html}
}
@inproceedings{sedici,
title={https://pythonspeed.com/articles/activate-conda-dockerfile/}
}
@inproceedings{diciassette,
title={https://biocontainers.pro/}
}
In the text, how can I cite the first paper? I tried \cite{uno}, as suggested in other questions, but it is not working. Here is the link to the repository: https://github.com/alessandriLuca/CREDO_paper
In general, in pandoc's Markdown, which JOSS uses, you cite with [@bibtex-key]; in your first case, that is [@uno]. See the pandoc documentation on citations for the details.
The complete setup is demonstrated in the JOSS documentation: "Example paper and Bibliography".
Another way to see examples is to look at how other papers on JOSS do it: see this paper, which is the first one I found on JOSS. You can see the [@bibtex-key] syntax there.
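For a draft that already contains LaTeX-style \cite{...} commands, the conversion to pandoc's citation syntax can be sketched in Python. The helper name cite_to_pandoc is hypothetical, and the sketch handles only plain \cite{key} and \cite{key1, key2}; it is an illustration, not part of the JOSS tooling.

```python
import re

# Matches \cite{key} or \cite{key1, key2}. Optional arguments such as
# \cite[p. 3]{key} are deliberately not handled in this sketch.
CITE_RE = re.compile(r"\\cite\{([^}]+)\}")


def cite_to_pandoc(text):
    """Rewrite LaTeX \\cite{...} commands as pandoc [@...] citations."""

    def repl(match):
        keys = [key.strip() for key in match.group(1).split(",")]
        # pandoc separates several keys inside one citation with semicolons.
        return "[" + "; ".join("@" + key for key in keys) + "]"

    return CITE_RE.sub(repl, text)


print(cite_to_pandoc(r"As shown previously \cite{uno}."))
print(cite_to_pandoc(r"See also \cite{uno, tre}."))
```

The first call prints "As shown previously [@uno]." and the second prints "See also [@uno; @tre].", using keys from the paper.bib shown in the question.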
import pandas as pd

data = {'desc': ['ADRIAN PETER - ANN 80020355787C - 11 Baillon Pass.pdf', 'AILEEN MARCUS - ANC 800E15432922 - 5 Mandarin Way.pdf',
'AJITH SINGH - ANN 80020837750 - 11 Berkeley Loop.pdf', 'ALEX MARTIN-CURTIS - ANC 80021710355 - 26 Dovedale St.pdf',
'Alice.Smith\\Jodee - Karen - ANE 80020428377 - 58 Harrisdale Dr.pdf']}
df = pd.DataFrame(data, columns=['desc'])
df
From the data frame, I want to create a new column called ID, and in that ID, I want to have only those values starting after ANN, ANC or ANE. So I am expecting a result as below.
ID
80020355787C
800E15432922
80020837750
80021710355
80020428377
I tried running the code below, but it did not get the desired result. Appreciate your help on this.
df['id'] = df['desc'].str.extract(r'\-([^|]+)\-')
You can use - AN[NCE] (800[0-9A-Z]+) -, where:
AN[NCE] matches literally AN followed by N or C or E;
800[0-9A-Z]+ matches literally 800 followed by one or more characters between 0 and 9 or between A and Z.
>>> df['desc'].str.extract(r'- AN[NCE] (800[0-9A-Z]+) -')
0
0 80020355787C
1 800E15432922
2 80020837750
3 80021710355
4 80020428377
If not all your ids start with "800", you can just remove it from the pattern.
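The variant without the "800" prefix can be checked outside pandas with the standard-library re module. This is a minimal sketch, assuming every ID is a run of digits and uppercase letters between the ANN/ANC/ANE marker and the next " - "; the helper name extract_id is hypothetical. The same pattern string can be passed to df['desc'].str.extract.

```python
import re

# Pattern from the answer with the literal "800" removed.
ID_RE = re.compile(r"- AN[NCE] ([0-9A-Z]+) -")

descs = [
    "ADRIAN PETER - ANN 80020355787C - 11 Baillon Pass.pdf",
    "AILEEN MARCUS - ANC 800E15432922 - 5 Mandarin Way.pdf",
    "AJITH SINGH - ANN 80020837750 - 11 Berkeley Loop.pdf",
    "ALEX MARTIN-CURTIS - ANC 80021710355 - 26 Dovedale St.pdf",
    "Alice.Smith\\Jodee - Karen - ANE 80020428377 - 58 Harrisdale Dr.pdf",
]


def extract_id(desc):
    """Return the ID that follows ANN/ANC/ANE, or None when nothing matches."""
    match = ID_RE.search(desc)
    return match.group(1) if match else None


ids = [extract_id(d) for d in descs]
print(ids)
```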
I was trying to output a Word document using the APA bibliography style. When I compiled my document to PDF, there was no problem at all: citations were listed correctly in the references section. The same was true of the HTML output. However, when I compiled to Word, my references were not listed correctly. This was my first attempt. Below is a corrected version of my MWE.
This is the bib entry I use:
@Book{Assoun1981,
author = {Assoun, Paul-Laurent},
title = {Introduction à l'épistémologie freudienne},
publisher = {Éditions Payot},
address = {Paris},
year = {1981},
}
And this is my MWE:
---
title: "A title"
output:
  pdf_document:
    latex_engine: xelatex
    citation_package: biblatex
    toc: yes
    toc_depth: 4
    number_sections: yes
  html_document:
    toc: yes
    toc_depth: 4
    df_print: paged
  word_document:
    toc: yes
    toc_depth: 4
date: "August 2022"
bibliography: YourLibraryName.bib
fontsize: 12pt
geometry: left=4cm,right=4cm,top=4cm,bottom=4cm
linestretch: 1.5
toc-title: Plan
links-as-notes: yes
link-citations: yes
header-includes:
  - \usepackage{fontspec}
  - \setmainfont[Numbers=OldStyle,Mapping=tex-text]{Janson Text LT Std}
  - \usepackage{fancyhdr}
  - \pagestyle{headings}
  - \fancyfoot[LE,RO]{\thepage}
  - \usepackage [french]{babel}
  - \usepackage[backend=biber,style=apa]{biblatex}
  - \DeclareLanguageMapping{french}{french-apa}
  - \DefineBibliographyExtras{french}{\restorecommand\mkbibnamefamily}
---
# Introduction
[@Assoun1981]
# Références
I edited my MWE and added a correct version.
I have a MongoDB instance which contains a translation of texts:
{
"_id" : ObjectId("57c68ba415f4d42b6ecd9ee7"),
"en" : "Adana (pronounced [aˈda.na]) is a major city in southern Turkey. The city is situated on the Seyhan river, 35 km (22 mi) inland from the Mediterranean Sea, in south-central Anatolia. It is the administrative seat of the Adana Province and has a population of 1.7 million,[1] making it the fifth most populous city in Turkey. Adana-Mersin polycentric metropolitan area, with a population of 3 million, stretches over 70 km (43 mi) east-west and 25 km (16 mi) north-south; encompassing the cities of Mersin, Tarsus and Adana.",
"sw" : "Adana (Kigiriki Άδανα) ni mji mkubwa katika nchi ya Uturuki. Kwa mujibu wa sensa iliyofanyika mwaka wa 2000, mji una wakazi wapatao 1,130,710 waishio huko,[2] na kuufanya kuwa mmoja kati ya miji mitano mikubwa ya Uturuku (baada ya Istanbul, Ankara, İzmir na Bursa). Mwaka wa 2006 mji wa Adana umekadiriwa kufikia iadadi ya wakazi wapatao 1,271,894. Huu ndiyo mji mkuu wa Mkoa wa Adana."
}
{
"_id" : ObjectId("57c68ba915f4d42b6ecd9eea"),
"en" : "Addis Ababa or Addis Abeba (the spelling used by the official Ethiopian Mapping Authority),(Amharic: አዲስ አበባ? Addis Abäba IPA: [adˈdis ˈabəba] ( listen), \"new flower\"; Oromo: Finfinne,[3][4] [fɪnˈfɪ́n.nɛ́] \"Natural Spring(s)\"), is the capital and largest city of Ethiopia. Finfinne is its Oromo name. It has a population of 3,384,569 according to the 2007 population census, with annual growth rate of 3.8%. This number has been increased from the originally published 2,738,248 figure and appears to be still largely underestimated.[2][5]",
"sw" : "Addis Ababa (pia Addis Abeba; kwa Kiamhara አዲስ አበባ, \"Ua Jipya\"; kwa Kioromo Finfinne) ni mji mkuu wa Ethiopia na wa Umoja wa Afrika."
}
{
"_id" : ObjectId("57c68bab15f4d42b6ecd9eec"),
"en" : "Adelaide of Italy (931 – 16 December 999), also called Adelaide of Burgundy, was the second wife of Holy Roman Emperor Otto the Great[2] and was crowned as the Holy Roman Empress with him by Pope John XII in Rome on February 2, 962. Empress Adelaide was perhaps the most prominent European woman of the 10th century; she was regent of the Holy Roman Empire as the guardian of her grandson in 991-995.[2]",
"sw" : "Adelaide wa Italia (takriban 931 – 16 Desemba, 999) alikuwa binti wa Rudolf II, mfalme wa Burgundia. Kwanza aliolewa na Lothar, mfalme wa Italia. Alipofariki Lothar, Adelaide aliolewa na Otto I, mfalme wa Ujerumani. Aliishi maisha matakatifu. Sikukuu yake ni 16 Desemba."
}
What I would like to do is to select one specific record. For example I expect to select the last record by doing this:
db.wiki.find({"sw": "Adelaide wa Italia"}).pretty();
But the mongo shell returns nothing.
Indeed, I know that I can create an index and do something like:
db.wiki.find({$text: {$search: "\"Adelaide wa Italia\""}}).pretty();
which indeed returns the record as expected.
What am I doing wrong in the non-index searching please?
In this case you should search with a regular expression:
db.wiki.find({"sw": /Adelaide wa Italia/}).pretty();
With the query you are using:
db.wiki.find({"sw": "Adelaide wa Italia"}).pretty();
you tell MongoDB to return only the documents whose sw field is exactly equal to "Adelaide wa Italia", whereas you want every document whose sw field contains that phrase.
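The difference between the two queries can be illustrated in plain Python. This is a sketch only: it imitates the matching locally with the re module rather than calling MongoDB, the helper names are hypothetical, and the sw text is shortened. $regex is MongoDB's query operator for regular expressions; escaping the phrase with re.escape is an added precaution, on the assumption that the phrase should be matched literally.

```python
import re

# Shortened "sw" value of the third document from the question.
sw_text = (
    "Adelaide wa Italia (takriban 931 – 16 Desemba, 999) alikuwa binti wa "
    "Rudolf II, mfalme wa Burgundia."
)
phrase = "Adelaide wa Italia"


def equality_matches(value, wanted):
    """Imitate {"sw": "..."}: the whole field must equal the string."""
    return value == wanted


def contains_query(field, text):
    """Build a filter document that matches `text` anywhere in `field`."""
    return {field: {"$regex": re.escape(text)}}


query = contains_query("sw", phrase)
print(equality_matches(sw_text, phrase))                 # False
print(bool(re.search(query["sw"]["$regex"], sw_text)))   # True
```

The equality test fails because the field holds a whole paragraph, while the regular expression only needs the phrase to occur somewhere inside it.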
I tried a script that marks journal references using the SCORE condition.
W{REGEXP("Journal",true)->MARK(ONLY_Journal)};
W{REGEXP("Retraction|Retracted")->MARK(RETRACT)};
W{REGEXP("Suppl")->MARK(SUPPLY)};
NUM {->MARK(VOLUMEISSUE,1,6)}LParen NUM SPECIAL?{REGEXP("-")} NUM? RParen;
Reference{CONTAINS(ONLY_Journal)->MARKSCORE(10,JOURNAL_MAYBE)};
Reference{CONTAINS(JournalVolumeMarker)->MARKSCORE(5,JOURNAL_MAYBE)};
Reference{CONTAINS(VOLUMEISSUE)->MARKSCORE(15,JOURNAL_MAYBE)};
Reference{CONTAINS(JOURNALNAME)->MARKSCORE(10,JOURNAL_MAYBE)};
Reference{CONTAINS(RETRACT)->MARKSCORE(10,JOURNAL_MAYBE)};
Reference{CONTAINS(SUPPLY)->MARKSCORE(5,JOURNAL_MAYBE)};
JOURNAL_MAYBE{SCORE(20,55)->MARK(JOURNAL)};
Sample Text
1.Lawrence RA. A review of the medical 342–340 benefits and contraindications to breastfeeding in the United States [Internet] . Arlington (VA): National Center for Education in Maternal and Child Health; 1997 Oct [cited 2000 Apr 24]. p. 40. Available from: www.ncemch.org/pubs/PDFs/Welcometojungle.pdf.
2.Shishido A. Retraction notice: Effect of platinum compounds on murine lymphocyte mitogenesis [Retraction of Alsabti EA, Ghalib ON, Salem MH. In: Jpn J Med Biol 1979 Apr; 32(2):53-65]. Jpn J Med Sci Biol 1980 Aug;33(4):235-237.
3.Leist TP, Zinkernagel RM. Effects of treatment with IL-2 receptor specific monoclonal antibody in mice [letter] [Retraction of Leist TP, Kohler M, Eppler M, Zinkernagel RM. In: J Immunol 1989 Jul 15; 143(2): 628-32]. J Immunol 1990 Apr 1;144(7):2847.
4.Chen, L., James, N., Barker, C., Busam, K., & Marghoob, A. (2013). Desmoplastic
melanoma: A review. Journal of the American Academy of Dermatology, 68(5), 825-833.
doi: 10.1016/j.jaad.2012.10.041.
But the above script is not working. Can anyone suggest a solution?
Thanks in advance.
This should work just fine, but it depends, of course, on whether annotations of the types ONLY_Journal, JournalVolumeMarker, and so on exist, and on how many of them each reference contains.
Here's a test script for a simple Ruta project:
ENGINE utils.PlainTextAnnotator;
TYPESYSTEM utils.PlainTextTypeSystem;
Document{->EXEC(PlainTextAnnotator, {Paragraph})};
DECLARE Reference, ONLY_Journal, JOURNAL_MAYBE, JournalVolumeMarker, VOLUMEISSUE, JOURNALNAME, RETRACT, SUPPLY;
DECLARE JOURNAL;
Paragraph{-> Reference};
"Jpn J Med Biol" -> JOURNALNAME;
"32\\(2\\)" -> VOLUMEISSUE;
Reference{CONTAINS(ONLY_Journal)->MARKSCORE(10,JOURNAL_MAYBE)};
Reference{CONTAINS(JournalVolumeMarker)->MARKSCORE(5,JOURNAL_MAYBE)};
Reference{CONTAINS(VOLUMEISSUE)->MARKSCORE(15,JOURNAL_MAYBE)};
Reference{CONTAINS(JOURNALNAME)->MARKSCORE(10,JOURNAL_MAYBE)};
Reference{CONTAINS(RETRACT)->MARKSCORE(10,JOURNAL_MAYBE)};
Reference{CONTAINS(SUPPLY)->MARKSCORE(5,JOURNAL_MAYBE)};
JOURNAL_MAYBE{SCORE(20,55)->MARK(JOURNAL)};
Applied to the sample text, this script annotates the second reference with JOURNAL.
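Why the second reference qualifies can be followed with simple arithmetic. The Python sketch below is not Ruta code; it only mirrors the scoring of the test script, in which JOURNALNAME and VOLUMEISSUE are the only helper annotations actually created. The function names are hypothetical, and treating the SCORE bounds as inclusive is an assumption that does not affect this example.

```python
# Scores taken from the MARKSCORE rules in the script.
RULE_SCORES = {
    "ONLY_Journal": 10,
    "JournalVolumeMarker": 5,
    "VOLUMEISSUE": 15,
    "JOURNALNAME": 10,
    "RETRACT": 10,
    "SUPPLY": 5,
}


def journal_score(contained_types):
    """Sum the scores of every rule whose CONTAINS condition holds."""
    return sum(RULE_SCORES[t] for t in contained_types)


def is_journal(contained_types, low=20, high=55):
    """Mirror JOURNAL_MAYBE{SCORE(20,55)->MARK(JOURNAL)}."""
    return low <= journal_score(contained_types) <= high


# The second reference contains "Jpn J Med Biol" and "32(2)".
second_reference = {"JOURNALNAME", "VOLUMEISSUE"}
print(journal_score(second_reference))  # 25
print(is_journal(second_reference))     # True
print(is_journal({"JOURNALNAME"}))      # False, 10 is below 20
```

A reference that triggers only one low-scoring rule stays below the threshold, which is why the presence of the helper annotations matters.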
DISCLAIMER: I am a developer of UIMA Ruta.
I tried to POS tag a sentence in Scala using the Stanford parser, as shown below:
val lp:LexicalizedParser = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
lp.setOptionFlags("-maxLength", "50", "-retainTmpSubcategories")
val s = "I love to play"
val parse :Tree = lp.apply(s)
val taggedWords = parse.taggedYield()
println(taggedWords)
I got the error "type mismatch; found : java.lang.String required: java.util.List[_ <: edu.stanford.nlp.ling.HasWord]" on the line val parse :Tree = lp.apply(s)
I don't know whether this is the right way of doing it or not. Are there any other easy ways of POS tagging a sentence in Scala?
You might like to consider the FACTORIE toolkit (http://github.com/factorie/factorie). It is a general library for machine learning and graphical models that happens to include an extensive suite of natural language processing components (tokenization, token normalization, morphological analysis, sentence segmentation, part-of-speech tagging, named entity recognition, dependency parsing, mention finding, coreference).
Furthermore it is written entirely in Scala, and it is released under the Apache License.
Documentation is currently sparse, but will be improving in the coming months.
For example, once Maven-based installation is finished you can type at the command line:
bin/fac nlp --pos1 --parser1 --ner1
to launch a socket-listening multi-threaded NLP server. Then query it by piping plain text to its socket number:
echo "Mr. Jones took a job at Google in New York. He and his Australian wife moved from New South Wales on 4/1/12." | nc localhost 3228
The output is then
1 1 Mr. NNP 2 nn O
2 2 Jones NNP 3 nsubj U-PER
3 3 took VBD 0 root O
4 4 a DT 5 det O
5 5 job NN 3 dobj O
6 6 at IN 3 prep O
7 7 Google NNP 6 pobj U-ORG
8 8 in IN 7 prep O
9 9 New NNP 10 nn B-LOC
10 10 York NNP 8 pobj L-LOC
11 11 . . 3 punct O
12 1 He PRP 6 nsubj O
13 2 and CC 1 cc O
14 3 his PRP$ 5 poss O
15 4 Australian JJ 5 amod U-MISC
16 5 wife NN 6 nsubj O
17 6 moved VBD 0 root O
18 7 from IN 6 prep O
19 8 New NNP 9 nn B-LOC
20 9 South NNP 10 nn I-LOC
21 10 Wales NNP 7 pobj L-LOC
22 11 on IN 6 prep O
23 12 4/1/12 NNP 11 pobj O
24 13 . . 6 punct O
Of course there is a programmatic API to all this functionality as well.
import cc.factorie._
import cc.factorie.app.nlp._
val doc = new Document("Education is the most powerful weapon which you can use to change the world.")
DocumentAnnotatorPipeline(pos.POS1).process(doc)
for (token <- doc.tokens)
println("%-10s %-5s".format(token.string, token.posLabel.categoryValue))
will output:
Education NN
is VBZ
the DT
most RBS
powerful JJ
weapon NN
which WDT
you PRP
can MD
use VB
to TO
change VB
the DT
world NN
. .
I found a very simple way to do POS tagging in Scala.
Step 1
Download Stanford POS Tagger version 3.2.0 from the link below:
http://nlp.stanford.edu/software/stanford-postagger-2013-06-20.zip
Step 2
Add the stanford-postagger JAR from the extracted folder to your project, and also place the english-left3words-distsim.tagger file from the models folder in your project.
Then you can POS tag a sentence in Scala with the code below:
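import edu.stanford.nlp.tagger.maxent.MaxentTagger

// Load the English model; the path is relative to the working directory.
val tagger = new MaxentTagger(
"english-left3words-distsim.tagger")
val art_con = "My name is Rahul"
// tagString returns the sentence with a _TAG suffix on every token.
val tagged = tagger.tagString(art_con)
println(tagged)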
Output: My_PRP$ name_NN is_VBZ Rahul_NNP
I believe the API of the Stanford Parser has changed somewhat, as it does sometimes. apply has the signature, public Tree apply(java.util.List<? extends HasWord> words), and this is what you see in the error message.
What you should use now is parse, which has the signature public Tree parse(java.lang.String sentence).