wget issue with certain html files - wget

I wrote a little wget command to download a page + all links that belong to the same directory, but only one level deep.
wget -Ekpr -np -l 1 -D <domain> <page>
The flags:
E: adjust-extension
k: convert-links
p: page-requisites
r: recursive
np: no-parent
l: level
D: domains
This works great for a lot of sites, but I had issues with this site in particular.
wget -Ekpr -np -l 1 -D www.eurocanadians.ca https://www.eurocanadians.ca/2022/02/the-origins-of-the-personal-computer-and-what-you-can-do-about-it.html
It treats the html file as a directory:
└── www.eurocanadians.ca
    ├── 2022
    │   └── 02
    │       └── the-origins-of-the-personal-computer-and-what-you-can-do-about-it.html
    │           └── feed
When I replace the -r flag with -x (--force-directories), I get this result:
└── www.eurocanadians.ca
    ├── 2022
    │   └── 02
    │       └── the-origins-of-the-personal-computer-and-what-you-can-do-about-it.html
Now it is treated as an HTML file (as it should be), but this is a single-page download and none of the first-level links are fetched.
How can I have the page treated as HTML while still using -r? By the way, -F (--force-html) didn't help.
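The clash in the recursive run can be reproduced without wget. Below is a minimal sketch, assuming a recursive mirror stores each URL under host plus URL path, and assuming the page links to a URL ending in `.html/feed` (inferred from the first tree above, not confirmed). The helper names `local_path` and `find_file_dir_clashes` are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path(url: str) -> PurePosixPath:
    """Map a URL to host/path, roughly how a recursive mirror lays files out."""
    parts = urlsplit(url)
    return PurePosixPath(parts.netloc + parts.path)


def find_file_dir_clashes(urls):
    """Return local paths that would have to be both a file and a directory."""
    paths = {local_path(u) for u in urls}
    clashes = set()
    for p in paths:
        # If any ancestor of p is itself a download target, that ancestor
        # must be a regular file and a directory at the same time.
        for parent in p.parents:
            if parent in paths:
                clashes.add(parent)
    return sorted(str(c) for c in clashes)


page = ("https://www.eurocanadians.ca/2022/02/"
        "the-origins-of-the-personal-computer-and-what-you-can-do-about-it.html")
# The /feed URL is an assumption based on the directory listing in the question.
print(find_file_dir_clashes([page, page + "/feed"]))
```

With only the page itself in the list there is no clash, which matches the -x run where the .html path stays a regular file.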

Using Dist::Zilla dist.ini how can I have files that I only use for testing?

In a Dist::Zilla-based distribution I would like to have some files that are only used for testing, but do not get installed. These are mockup libs that aren't needed for runtime.
How do I do that?
CPAN distributions never install the t and xt directories. You can put your tests and your mock libs into t.
As an example, take my module MooseX::LocalAttribute. In the dist, there is a t/, a t/lib and an xt/.
If you install this using cpanm -l into a local lib dir, you will see there are no tests installed. This happens automatically. It's just how CPAN works.
$ cpanm -l mylib MooseX::LocalAttribute
--> Working on MooseX::LocalAttribute
Fetching http://www.cpan.org/authors/id/S/SI/SIMBABQUE/MooseX-LocalAttribute-0.05.tar.gz ... OK
Configuring MooseX-LocalAttribute-0.05 ... OK
Building and testing MooseX-LocalAttribute-0.05 ... OK
Successfully installed MooseX-LocalAttribute-0.05
1 distribution installed
$ tree mylib
mylib
├── lib
│   └── perl5
│   ├── MooseX
│   │   └── LocalAttribute.pm
│   └── x86_64-linux
│   ├── auto
│   │   └── MooseX
│   │   └── LocalAttribute
│   └── perllocal.pod
└── man
└── man3
└── MooseX::LocalAttribute.3
9 directories, 3 files
Note that as long as stuff is in t/lib (or anywhere under t/, really), you do not have to hide the package names from the PAUSE indexer. It's smart enough to not find it.
I misunderstood the question. This answer is for the following question:
How do I exclude files from a Dist::Zilla based distribution so they don't get shipped at all?
You are probably using either the GatherDir or Git::GatherDir plugin to build your bundle. Both of them have an option exclude_filename that you can set in your dist.ini to not include a file in a bundle.
A common pattern is to exclude auto-generated files such as LICENSE or META.json, and then add them later with another plugin. But you don't have to do that, you can just exclude files completely.
A good example is the URI distribution. On metacpan, it does not include any text files in the bundle. But if you look at the repository on github, you can see there are various .txt files such as rfc2396.txt. The dist.ini contains the following lines.
[Git::GatherDir]
exclude_filename = LICENSE
exclude_filename = README.md
exclude_filename = draft-duerst-iri-bis.txt
exclude_filename = rfc2396.txt
exclude_filename = rfc3986.txt
exclude_filename = rfc3987.txt
As mentioned before, the LICENSE and README.md files will still appear in the final bundle, because they get added later via @Git::VersionManager.
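The effect of exclude_filename can be modelled as an exact-match filter over the gathered file list. This is only an illustrative sketch in Python, not Dist::Zilla's implementation; `gather_files` and the sample file list are hypothetical.

```python
def gather_files(all_files, exclude_filename=()):
    """Keep every file whose path is not listed exactly in exclude_filename."""
    excluded = set(exclude_filename)
    return [f for f in all_files if f not in excluded]


# Hypothetical repository contents, loosely modelled on the URI example.
repo_files = ["LICENSE", "README.md", "rfc2396.txt", "lib/URI.pm", "t/basic.t"]

shipped = gather_files(
    repo_files,
    exclude_filename=["LICENSE", "README.md", "rfc2396.txt"],
)
print(shipped)
```

A later plugin may add files such as LICENSE back, which in this model would simply mean appending to the filtered list afterwards.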

Links to json files

My directory structure is:
├── xxx
│   ├── 01.md
│   └── 02.md
├── auth
│   ├── j1.json
│   ├── j2.json
│   └── j3.json
└── default.template.html
I link to the JSON files from the Markdown files, like Auth. This makes sense because we use these files as test scenarios, and the JSON files hold credentials and roles. But when I try to generate HTML, it fails with unresolved internal reference: ../auth/aspect_admin.json. I tried to exclude the link from checking, but that did not help. Ideally the link would stay as a link in the md file, but be followed so that the JSON is included as a code block in the generated HTML. Is that possible?
It was a bug and will be fixed in the next version: https://github.com/planet42/Laika/issues/148

Yocto: cp can't stat file: no such file or directory

I am trying to copy two folders (containing some scripts) into my target rootfs. I have created a custom layer with a custom recipe inside it.
My directory structure is like this:
../sources/meta-company/recipes-bla_2.06/
└── bla
    ├── bla
    │   ├── dir1
    │   │   ├── dir
    │   │   │   └── files.sh
    │   └── dir2
    │       ├── dir
    │       │   ├── files.sql
    │       ├── test.sh
    └── bla_2.06.bb
My .bb file is as follows:
DESCRIPTION = " bla "
LICENSE = "CLOSED"
SRC_URI = "file://dir1/ \
file://dir2/ "
do_install() {
    install -d ${D}/root/dir1
    install -d ${D}/root/dir2
    cp -r --no-dereference --preserve=mode,links -v ${S}/dir1/ ${D}/root/dir1
    cp -r --no-dereference --preserve=mode,links -v ${S}/dir2/ ${D}/root/dir2/
}
FILE_$PN = "/root/"
The error I am getting:
> Log data follows: | DEBUG: Executing shell function do_install | cp:
> cannot stat
> '/home/amol/test/fsl-arm-yocto-bsp/build-cl-som-imx7-fsl-imx-x11/tmp/work/cortexa7hf-neon-poky-linux-gnueabi/bla/1.0-r0/bla-1.0/dir1':
> No such file or directory | WARNING: exit code 1 from a shell command.
> | ERROR: Function failed: do_install (log file is located at
> /home/amol/test/fsl-arm-yocto-bsp/build-cl-som-imx7-fsl-imx-x11/tmp/work/cortexa7hf-neon-poky-linux-gnueabi/seriald/1.0-r0/temp/log.do_install.49808)
> NOTE: recipe bla-1.0-r0: task do_install: Failed NOTE: Tasks Summary:
> Attempted 334 tasks of which 333 didn't need to be rerun and 1 failed.
I am new to Yocto. Is my .bb file correct? Thanks in advance.
There are two problems in your do_install section:
- ${S} points to the source directory, but SRC_URI copies your content into ${WORKDIR}. So you should be using ${WORKDIR} in your install section.
- You are copying ${S}/dir1/ into ${D}/root/dir1, so your final structure is /root/dir1/dir1/. You probably do not want this.
So the modified version would look like this:
do_install() {
    install -d ${D}/root/dir1
    install -d ${D}/root/dir2
    cp -r --no-dereference --preserve=mode,links -v ${WORKDIR}/dir1/* ${D}/root/dir1/
    cp -r --no-dereference --preserve=mode,links -v ${WORKDIR}/dir2/* ${D}/root/dir2/
}
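The second problem (the nested dir1/dir1) is easy to reproduce outside of BitBake. The sketch below imitates the two copy styles with Python's shutil in a temporary directory; it assumes Python 3.8+ for `dirs_exist_ok`, and the helper names are hypothetical. Unlike the shell glob `dir1/*`, `copytree` also copies dotfiles.

```python
import shutil
import tempfile
from pathlib import Path


def copy_dir_into(src: Path, dest: Path) -> None:
    """Like `cp -r src dest` when dest already exists: creates dest/<src name>."""
    shutil.copytree(src, dest / src.name)


def copy_contents(src: Path, dest: Path) -> None:
    """Like `cp -r src/* dest/`: the entries of src land directly in dest."""
    shutil.copytree(src, dest, dirs_exist_ok=True)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for ${WORKDIR}/dir1/dir/files.sh
        workdir = Path(tmp) / "workdir"
        (workdir / "dir1" / "dir").mkdir(parents=True)
        (workdir / "dir1" / "dir" / "files.sh").write_text("#!/bin/sh\n")

        # Original recipe: the destination already exists (install -d),
        # so the whole directory is copied inside it.
        nested = Path(tmp) / "nested" / "root" / "dir1"
        nested.mkdir(parents=True)
        copy_dir_into(workdir / "dir1", nested)

        # Modified recipe: only the contents are copied.
        flat = Path(tmp) / "flat" / "root" / "dir1"
        flat.mkdir(parents=True)
        copy_contents(workdir / "dir1", flat)

        return (
            (nested / "dir1" / "dir" / "files.sh").exists(),
            (flat / "dir" / "files.sh").exists(),
        )


print(demo())
```

The first element of the result confirms the unwanted root/dir1/dir1/dir/files.sh layout, the second the intended root/dir1/dir/files.sh.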

Locally building and pushing VuePress site to Github Pages

I'm having trouble figuring out the workflow for using GitHub as source control for a VuePress site and deploying it to GitHub Pages.
When I ran deploy.sh the first time, it gave me a GitHub certificate error around the init command and did not initialize a new repo (I already have a repo set up, so I'm not sure the init command in deploy.sh is required). Subsequent runs of deploy.sh resulted in no error.
**Problem:** Unfortunately, when I visit my GitHub Pages site, it's not using VuePress templates.
I feel like I have either:
- The folder structure wrong
- The base set incorrectly in config.js
- The relative folders incorrect in deploy.sh
Can someone put eyes on this and give some feedback? Thank you.
For your reference
Local machine's folder structure:
user@system:~/powerDocs$ tree
.
├── deploy.sh
├── docs
│   └── README.md
├── node_modules
│   └── yarn
│       ├── bin
│       │   ├── yarn
│       │   ├── yarn.cmd
│       │   ├── yarn.js
│       │   ├── yarnpkg
│       │   └── yarnpkg.cmd
│       ├── lib
│       │   ├── cli.js
│       │   └── v8-compile-cache.js
│       ├── LICENSE
│       ├── package.json
│       └── README.md
├── package.json
├── package-lock.json
└── README.md
5 directories, 15 files
Content of deploy.sh:
#!/usr/bin/env sh
# abort on errors
set -e
# build
vuepress build
# navigate into the build output directory
cd docs/.vuepress/dist
# if you are deploying to a custom domain
# echo 'www.example.com' > CNAME
git init
git add -A
git commit -m 'deploy'
# if you are deploying to https://<USERNAME>.github.io
# git push -f git@github.com:SeaDude/SeaDude.github.io.git master
# if you are deploying to https://<USERNAME>.github.io/<REPO>
git push -f git@github.com:SeaDude/powerDocs.git master:gh-pages
cd -
I made deploy.sh executable with chmod +x deploy.sh. Running ./deploy.sh gives me the following output:
user@system:~/powerDocs$ ./deploy.sh
WAIT Extracting site metadata...
[12:05:53 PM] Compiling Client
[12:05:53 PM] Compiling Server
(node:15590) DeprecationWarning: Tapable.plugin is deprecated. Use new API on `.hooks` instead
[12:05:57 PM] Compiled Server in 3s
[12:05:59 PM] Compiled Client in 6s
WAIT Rendering static HTML...
DONE Success! Generated static files in .vuepress/dist.
Reinitialized existing Git repository in /home/powerDocs/docs/.vuepress/dist/.git/
On branch master
nothing to commit, working directory clean
Here is the contents of config.js:
module.exports = {
  title: "PowerDocs",
  description: "Where functions go to frolic.",
  base: "/powerDocs/",
  themeConfig: {
    nav: [
      { text: "Home", link: "/" }
    ],
    sidebar: [
      '/'
    ]
  }
};
Have you checked your dist folder to see what is actually being output? The "nothing to commit" message makes it seem like there are no new files to commit after the build.
I have an almost identical setup locally and haven't run into this problem. The only difference is that the command I run to build is yarn docs:build.
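Checking the dist folder can be scripted before anything is committed. The following is a rough sketch, assuming the build is expected to leave an index.html at the top of the output directory; `dist_problems` is a hypothetical helper, not part of VuePress.

```python
import tempfile
from pathlib import Path


def dist_problems(dist_dir):
    """Return a list of reasons the build output looks undeployable."""
    dist = Path(dist_dir)
    if not dist.is_dir():
        return [f"{dist} does not exist"]
    problems = []
    # Ignore the .git directory that deploy.sh creates inside dist.
    built = [
        p for p in dist.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(dist).parts
    ]
    if not built:
        problems.append("no built files")
    if not (dist / "index.html").is_file():
        problems.append("index.html is missing")
    return problems


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        dist = Path(tmp) / "dist"
        dist.mkdir()
        empty = dist_problems(dist)
        (dist / "index.html").write_text("<!doctype html>")
        populated = dist_problems(dist)
        return empty, populated


print(demo())
```

In the question's layout it would be pointed at docs/.vuepress/dist; a non-empty result suggests the build wrote its output somewhere else.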

inheritance-diagrams in sphinx for matlab

I am documenting some MATLAB code with Sphinx. I am using the package sphinxcontrib-matlabdomain.
My directory tree is as follows:
me:~/.../doc$ tree ../
../
├── doc
│   ├── conf.py
│   ├── make.bat
│   ├── Makefile
│   ├── index.rst
│   ├── BaseClass.rst
│   └── DerivedClass.rst
├── LICENSE.md
├── README.md
└── src
    ├── BaseClass.m
    └── DerivedClass.m
The problem comes when I want to show inheritance diagrams. I have added the necessary settings to my conf.py file:
import os  # needed for os.path.abspath below

matlab_src_dir = os.path.abspath('..')
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.inheritance_diagram',
    'sphinx.ext.graphviz',
    'sphinx.ext.viewcode',
    'sphinxcontrib.matlab',
]
primary_domain = 'mat'
And I have the following in the index.rst file:
Welcome to BGK's documentation!
===============================

I am trying to have a diagram here...

.. inheritance-diagram:: BaseClass DerivedClass
   :parts:2

.. graphviz::

   digraph {
      "From here" -> "To" -> "Somewhere";
      "From here" -> "To" -> "Somewhere else";
   }
In the output, the inheritance-diagram directive is ignored. Only the graphviz diagram appears, which I added to test that diagrams can be rendered at all.
Is Sphinx unable to plot inheritance diagrams for MATLAB classes? Is there a way to work around the problem? Thanks!
Sphinx does not support this. The built-in sphinx.ext.inheritance_diagram extension is for the Python domain only. It does not work for Matlab. If it did, I'm sure it would say so in the Sphinx documentation (and a glance at the source code in sphinx/ext/inheritance_diagram.py confirms that it is only for Python).
The only way inheritance diagrams for Matlab could work is if some other extension provided the functionality. The sphinxcontrib-matlabdomain extension that you use does not.
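Since the graphviz directive in the question does render, one possible workaround is to generate the DOT text yourself from the classdef lines and paste it under a graphviz directive. This is a rough sketch, not a feature of Sphinx or sphinxcontrib-matlabdomain. The regular expression only handles simple `classdef Derived < Base1 & Base2` headers, and the sample sources are assumptions because the contents of the .m files are not shown in the question.

```python
import re

# Matches "classdef Name", optionally with attributes in parentheses and an
# optional "< Base1 & Base2" list; a trailing % comment is not captured.
CLASSDEF = re.compile(
    r"^\s*classdef\s+(?:\([^)]*\)\s*)?(\w+)\s*(?:<\s*([^%\n]+))?",
    re.MULTILINE,
)


def inheritance_edges(source):
    """Yield (base, derived) pairs found in MATLAB source text."""
    for match in CLASSDEF.finditer(source):
        derived, bases = match.group(1), match.group(2)
        if bases:
            for base in bases.split("&"):
                yield base.strip(), derived


def to_dot(sources):
    """Build a DOT digraph with one edge per base/derived pair."""
    lines = ["digraph inheritance {"]
    for src in sources:
        for base, derived in inheritance_edges(src):
            lines.append(f'    "{base}" -> "{derived}";')
    lines.append("}")
    return "\n".join(lines)


# Assumed file contents, for illustration only.
base_src = "classdef BaseClass\nend\n"
derived_src = "classdef DerivedClass < BaseClass\nend\n"
print(to_dot([base_src, derived_src]))
```

In practice the sources would be read from the files under src/, and the script would have to be rerun whenever the class hierarchy changes.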