Hi,
I've just completed DO447! It's a great course that I definitely recommend to any Ansible user :)
Here is my review, as I did for DO425. I practiced with the Online Training only (not the Video Classroom).
Overall, the main problem of this course is that students have to do all of the labs in order, without skipping any and without ever resetting any of the lab VMs.
Indeed, many labs rely on the existence of Ansible Tower objects created in previous labs (mostly in chapters 7, 8 and 9). For example, lab 9.5 relies on:
Likewise, the labs' git repository content is occasionally neither reset nor even deployed by the corresponding lab start script. For example, while retrying guided exercise 9.4, I realized that the git content on my utility VM had not been reset to its original state by the lab start script and still contained commits I had previously pushed. I went ahead and reset my utility VM, only to realize that the lab start script did not deploy any git content at all for that lab. After investigating further, I found the precious command that deploys the git content I needed for guided exercise 9.4. It was the lab start script of guided exercise... 1.4! (lab development-git start).
Similarly, "14.4 Guided Exercise: Configuring TLS/SSL for Ansible Tower" relies on the IdM installed at "6.4 Guided Exercise: Installing Red Hat Ansible Tower".
When designing a new course, you should assume that students often skip chapters they're familiar with, or occasionally need to reset some of the lab VMs to retry certain labs from scratch. Therefore, the lab start scripts should verify or create all of the requirements in terms of Tower objects, git content, or any third-party services. As for Tower objects, most of the required tower-cli commands are already present in the labs' lab_solve functions. Another option would be to perform a Tower restore in the labs' start scripts. As for the git content, the utility VM should embed all of the labs' git content in the first place, and each lab start script should hard-reset the current lab's git repository back to its initial commit.
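As a rough illustration of the git part, a reset step could look like the sketch below. It assumes that the lab repository is available as a regular (non-bare) clone, that its initial commit is the root commit, and that the reset is driven by Ansible; the target host and the lab_repo_path variable are hypothetical.

```yaml
- name: Reset a lab git repository to its initial commit (sketch)
  hosts: utility                # assumption: the VM holding the lab repository
  gather_facts: no
  vars:
    lab_repo_path: /path/to/lab/repository   # hypothetical path
  tasks:
    - name: Look up the root commit of the repository
      command: git rev-list --max-parents=0 HEAD
      args:
        chdir: "{{ lab_repo_path }}"
      register: root_commit
      changed_when: false

    - name: Hard-reset the working tree back to that commit
      command: "git reset --hard {{ root_commit.stdout_lines | first }}"
      args:
        chdir: "{{ lab_repo_path }}"
```

This only resets the local clone. The copy served by GitLab would additionally need a forced push, which is not shown here.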
The second main problem of this course is that some labs' grading scripts seem to inspect and assess the student's YAML code rather than the result of its execution. For example (in lab 15.3), using any_errors_fatal: true instead of max_fail_percentage: 0 is treated by the grading script as a failure, whereas it is a valid implementation.
Occasionally, the gitlab service is not available even 15 minutes after lab startup (HTTP 500) even though both gitlab services are up & running on services. Fixed by running :
[root@utility ~]# systemctl restart gitlab-runner
[root@utility ~]# systemctl restart gitlab-runsvdir
Finally, my lab environment was able to access the internet, which is unusual for a Red Hat lab. Just to let you know.
1.1 Implementing Recommended Practices
register: example_webpage
failed_when: example_webpage.status != 200
-> I do not consider that a best-practice, since the uri module comes with a status_code attribute. You can probably find a better example here, like looking for a text pattern in the web page.
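A minimal sketch of what I have in mind: the uri module checks the status code itself, and a separate assert looks for a text pattern in the page. The URL and the expected text are made up for the illustration.

```yaml
- name: Smoke-test a web page
  hosts: localhost
  gather_facts: no
  tasks:
    - name: Fetch the page (the task fails unless the status code is 200)
      uri:
        url: http://servera.lab.example.com   # hypothetical URL
        status_code: 200
        return_content: yes
      register: example_webpage

    - name: Verify that the page contains the expected text
      assert:
        that:
          - "'Hello World' in example_webpage.content"   # hypothetical pattern
```

Keeping the content check in its own task avoids overriding the failure condition of the uri task itself.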
1.3 Managing Ansible Project Materials Using Git
"Bare repositories does not have a local working tree."
"In the preceding example, the most recent commit for the branch master (and HEAD at that time) was commit 5749661, which occurred at some point in the past. A user ran the git branch feature/1 command, creating a branch, feature/1."
-> the commit is actually 7900dd94
2.1 Writing YAML Inventory Files
"These servers themselves form their own groups, so they must end in a colon"
-> host definitions must end in a colon because they are YAML dictionaries holding the host's own variables (if any), not because they form their own groups (or did you mean "their own blocks"?)
all:
  children:
    ungrouped:
      notinagroup.lab.example.com:
    mailserver:
      mail.lab.example.com:
-> the hosts keys are missing here
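For comparison, here is how I would expect the same inventory to look with the hosts keys in place. This is my own reconstruction, not the course's text.

```yaml
all:
  children:
    ungrouped:
      hosts:
        notinagroup.lab.example.com:
    mailserver:
      hosts:
        mail.lab.example.com:
```

Running ansible-inventory -i on such a file with the --graph option is a quick way to check that both hosts end up in the intended groups.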
2.5 Lab: Managing Inventories
-> At item 5, it is unclear that the requested numbered naming scheme has to be static and not dynamic.
3.2 Guided Exercise: Controlling Privilege Escalation
force_handlers: True
-> unnecessary here (there is really no reason for any task to fail); above all, it misleads students into thinking that handlers are executed every time, which is false; I'd rather set changed_when: true on the 'Ensure haproxy configuration is set' task of the 'haproxy' role instead
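Something along these lines, as a sketch only: the task name comes from the exercise, but the module, its arguments and the handler name are my assumptions about what the role contains.

```yaml
- name: Ensure haproxy configuration is set
  template:
    src: haproxy.cfg.j2             # assumed template name
    dest: /etc/haproxy/haproxy.cfg
  changed_when: true                # always report "changed" so the handler is notified
  notify: reload haproxy            # assumed handler name
```

With changed_when: true the handler runs on every execution, which is what the exercise wants to demonstrate, without suggesting that force_handlers has that effect.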
4.3 Templating External Data using Lookups
"Note that this example may not the most efficient way to do this particular task"
4.6 Guided Exercise: Implementing Advanced Loops
-> At this step there is no IdM or IPA service actually running on utility.lab.example.com. It is only installed later, in guided exercise 6.4. Therefore the 'ipa_user' module fails in scenario 1. So does the lab data-loops script.
5.4 Guided Exercise: Managing Rolling Updates
"After the playbook deploys the web application, a smoke test ensures that each back-end web server is responds with a 200 HTTP status code."
5.6 Summary
-> 2 orphan closing parentheses on this page
6.6 Guided Exercise: Accessing Red Hat Ansible Tower
"Review the output of the job execution to determine which tasks were executed. You should see that the msg module was used to successfully display a Hello World! message."
-> the module name is actually debug
7.3 Managing Users Efficiently with Teams
"(instead of read on individual Teams."
-> Closing parenthesis missing
9.5 Lab: Managing Projects and Launching Ansible Jobs
-> The lab's grading script checks that the Developers team has a use role on the Test inventory, which is neither required nor a lab objective.
10.10 Summary
"Ansible Tower provides a browsable REST API that can easily be used to automate Ansible Tower operations and integrate it with third-party products."
-> misplaced and repeats 11.6 Summary
11.4 Guided Exercise: Interacting with APIs using Ansible Playbooks
-> In the first playbook 'tower_copy_template.yml', registering the first 'uri' call to grab the inventory id is not only unnecessary but also confusing. Indeed, the retrieved value 'copy.json.inventory' is already an attribute of the newly copied template, not of the original one.
12.2 Guided Exercise: Importing External Static Inventories
(Step 4.7) "Click the double-arrow icon in the row for the git-inventory source to retrieve the changes. Wait until the cloud icon next to git-inventory is static and green."
-> this step is not needed because we checked the box : UPDATE ON PROJECT CHANGE
12.5 Filtering Hosts with Smart Inventories
"but that is not not the case"
12.6 Guided Exercise: Filtering Hosts with Smart Inventories
"These two systems' facts are available in Ansible Tower's cache because in a previous exercise we executed a job on those managed hosts with a job template that had fact caching enabled."
-> again, it would help a lot to at least specify which lab (it's actually guided exercise 10.2), or even better: use the lab start script to enforce that
14.4 Guided Exercise: Configuring TLS/SSL for Ansible Tower
[root@tower ~]# semanage fcontext -a -t cert_t "/etc/tower(/.*)?"
-> that pretty loose pattern matches all Tower configuration files, most of which are unrelated to certificates
-> Unfortunately, I was never able to complete this guided exercise, probably because of a previous reset of my utility VM. Even after running 'lab tower-install start' (from guided exercise 6.4) to re-install IdM, the certmonger package was still missing on my tower VM. After installing that package manually, the proper Kerberos configuration was still missing. I gave up at that point to save some of my scarce lab time. It is a pity that the lab start script does not take care of all that.
15. Comprehensive review
-> The labs' solutions are not hidden, so the comprehensive review cannot be used as a mock exam :(
15.3 Lab: Privilege Escalation, Lookups, and Rolling Updates
"If a single host fails to update, the playbook must stop executing immediately."
-> as said before, any_errors_fatal: true should be a valid answer here, but it is not, only max_fail_percentage: 0 is accepted as a valid solution
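To make the comparison concrete, here are the two variants side by side. The group name, the batch size and the placeholder task are hypothetical; only the two play keywords matter.

```yaml
# Variant 1: the only setting accepted by the grading script
- name: Update web servers, stopping when any host fails
  hosts: webservers            # hypothetical group name
  serial: 2                    # hypothetical batch size
  max_fail_percentage: 0
  tasks:
    - name: Placeholder for the real update tasks
      command: /bin/true
      changed_when: false

# Variant 2: the alternative I used, which I believe also meets the requirement
- name: Update web servers, stopping when any host fails
  hosts: webservers
  serial: 2
  any_errors_fatal: true
  tasks:
    - name: Placeholder for the real update tasks
      command: /bin/true
      changed_when: false
```

A grading script that runs the play against a deliberately failing host and checks that the remaining hosts are left untouched would accept both variants.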
(step 4) "Introduce logging tasks to register the start and end of the deployment on your control node."
-> Using lineinfile and delegate_to: localhost ends up with concurrent writing operations on the controller, leading to some lines missing, as explained here. Even though that buggy behaviour is more or less mitigated by the batch updates set up at the next step, it is not a very good practice.
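One possible way to avoid the concurrent writes, sketched under the assumption that one log line per batch is enough: mark the logging task with run_once so that a single host triggers it. The log path and the message are hypothetical.

```yaml
- name: Log the start of the deployment on the control node
  lineinfile:
    path: /tmp/deployment.log       # hypothetical log file
    line: "{{ lookup('pipe', 'date') }} Deployment started"
    create: yes
  delegate_to: localhost
  run_once: true                    # only one host performs the delegated write
```

According to the Ansible documentation, a run_once task combined with serial runs once per batch, so there is still only one writer at any given time.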
15.9: Lab: Testing the Prepared Environment
-> This lab is broken because of leftover content in /var/lib/mysql/ on servere, which prevents mariadb from starting on that server. That leftover content comes from scenario 2 of lab "4.6 - Guided Exercise: Implementing Advanced Loops", where mysql-server (and not mariadb-server) was installed.
Workaround:
[root@servere ~]# rm -Rf /var/lib/mysql/*
+ relaunch the Full Stack Deployment workflow Job Template
I may have skipped a lab that cleans up that content at some point during the course.
To add to this: It is *impossible* to get past the "Configuring Job Templates" section in the comprehensive review because the SSH key for cloning the git project doesn't work.