Re: [DO447] Typos, errors and bugs

littlebigfab · ‎09-12-2019

Hi,

I've just completed DO447! It's a great course that I definitely recommend to any Ansible user :)

Here is my review, as I did for DO425. I practiced with the Online Training only (not the Video Classroom).

Overall, the main problem of this course is that students have to do all of the labs in order, without skipping any and without ever resetting any lab VM.

Indeed, many labs rely on the existence of Ansible Tower objects created in previous labs (mostly in chapters 7, 8 and 9). For example, lab 9.5 relies on:

the Developers team, created at guided exercise 7.4
the Test inventory, created by the lab start script of guided exercise 8.2
the Operations machine credential, created at guided exercise 8.4
the student-git SCM credential, created at guided exercise 9.2

Likewise, the labs' git repository content is occasionally neither reset nor even deployed by the corresponding lab start script. For example, while retrying guided exercise 9.4, I realized that my git content on the utility VM had not been reset to its original state by the lab start script and still contained commits I had previously pushed. I reset my utility VM, only to realize that the lab start script did not deploy any git content at all for that lab. After investigating further, I found the precious command that deploys the git content I needed for guided exercise 9.4. It was the lab start script of guided exercise... 1.4! (lab development-git start).

Similarly, "14.4 Guided Exercise: Configuring TLS/SSL for Ansible Tower" relies on the IdM installed at "6.4 Guided Exercise: Installing Red Hat Ansible Tower".

When designing a new course, you should assume that students often skip chapters they're familiar with, or occasionally need to reset some lab VMs to retry certain labs from scratch. Therefore, the lab start scripts should verify/create all of the requirements in terms of Tower objects, git content or any third-party services. For Tower objects, most of the required tower-cli commands are already present in the labs' lab_solve functions. Another option would be to perform a Tower restore in the labs' start scripts. For git content, the utility VM should embed all labs' git content in the first place, and each lab start script should hard-reset the current lab's git repository back to its initial commit.

The second main problem of this course is that some labs' grading scripts seem to inspect and assess the student's YAML code rather than the result of its execution. For example (in lab 15.3), using any_errors_fatal: true instead of max_fail_percentage: 0 is interpreted by the grading script as a failure, whereas it's a valid implementation.

Occasionally, the gitlab service is still unavailable (HTTP 500) 15 minutes after lab startup, even though both gitlab services are up & running on services. Fixed by running:

[root@utility ~]# systemctl restart gitlab-runner
[root@utility ~]# systemctl restart gitlab-runsvdir

Finally, my lab environment was able to access the internet, which is unusual for a Red Hat lab. Just to let you know :

1.1 Implementing Recommended Practices

register: example_webpage
failed_when: example_webpage.status != 200

-> I do not consider that a best-practice, since the uri module comes with a status_code attribute. You can probably find a better example here, like looking for a text pattern in the web page.
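
A minimal sketch of the two alternatives suggested above. The URL and the text pattern are hypothetical placeholders, not taken from the course material. The first task lets the uri module check the status itself through status_code; the second fails on the page content instead of the status.

```yaml
- name: Check a web page without a redundant status test
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Fail unless the page answers with HTTP 200
      uri:
        url: http://www.example.com      # hypothetical URL
        status_code: 200                 # uri fails by itself on any other status

    - name: Fail unless the page contains the expected text
      uri:
        url: http://www.example.com      # hypothetical URL
        return_content: true
      register: example_webpage
      # 'Example Domain' is a hypothetical pattern; default('') guards against
      # a result that carries no content at all
      failed_when: "'Example Domain' not in (example_webpage.content | default(''))"
```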

1.3 Managing Ansible Project Materials Using Git

"Bare repositories does not have a local working tree."

"In the preceding example, the most recent commit for the branch master (and HEAD at that time) was commit 5749661, which occurred at some point in the past. A user ran the git branch feature/1 command, creating a branch, feature/1."

-> the commit is actually 7900dd94

2.1 Writing YAML Inventory Files

"These servers themselves form their own groups, so they must end in a colon"

-> host definitions must end in a colon because they are YAML dictionaries containing the host's own variables (if any), not because they form their own groups (or did you mean "their own blocks"?)

all:
  children:
    ungrouped:
      notinagroup.lab.example.com:
    mailserver:
      mail.lab.example.com:

-> the hosts keys are missing here
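
For comparison, the same inventory with the hosts keys added would presumably look like this (a sketch based on the snippet above, not the course's corrected text):

```yaml
all:
  children:
    ungrouped:
      hosts:                          # hosts of a group go under a "hosts" key
        notinagroup.lab.example.com:
    mailserver:
      hosts:
        mail.lab.example.com:
```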

2.5 Lab: Managing Inventories

-> At item 5, it is unclear that the requested numbered naming scheme has to be static and not dynamic.

3.2 Guided Exercise: Controlling Privilege Escalation

force_handlers: True

-> needless here (there is really no reason any task would fail) ; above all, it misleadingly lets the students think that handlers are going to be executed every time, which is a false statement ; I'd rather insert changed_when: true on the 'Ensure haproxy configuration is set' task of role 'haproxy' instead
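
A rough sketch of the changed_when suggestion. Only the task name and the role name come from the exercise; the template file, destination path and handler name are assumptions made for illustration.

```yaml
# roles/haproxy/tasks/main.yml (excerpt; file and handler names are hypothetical)
- name: Ensure haproxy configuration is set
  template:
    src: haproxy.cfg.j2               # hypothetical template name
    dest: /etc/haproxy/haproxy.cfg    # assumed destination path
  changed_when: true                  # always report "changed", so the handler is always notified
  notify: reload haproxy              # hypothetical handler name
```

With this, the handler runs on every execution because the task always reports a change, which is what the exercise seems to want to demonstrate, without suggesting that force_handlers does it.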

4.3 Templating External Data using Lookups

"Note that this example may not the most efficient way to do this particular task"

4.6 Guided Exercise: Implementing Advanced Loops

-> At this step there is no IDM or IPA service actually running on utility.lab.example.com. It's installed later at guided exercise 6.4. Therefore the 'ipa_user' module fails in scenario 1. So does the lab data-loops script.

5.4 Guided Exercise: Managing Rolling Updates

"After the playbook deploys the web application, a smoke test ensures that each back-end web server is responds with a 200 HTTP status code."

5.6 Summary

-> 2 orphan closing parentheses on this page

6.6 Guided Exercise: Accessing Red Hat Ansible Tower

"Review the output of the job execution to determine which tasks were executed. You should see that the msg module was used to successfully display a Hello World! message."

-> the module name is actually debug

7.3 Managing Users Efficiently with Teams

"(instead of read on individual Teams."

-> Closing parenthesis missing

9.5 Lab: Managing Projects and Launching Ansible Jobs

-> The lab's grading script checks that the Developers team has a use role on the Test inventory, which is neither required nor a lab objective.

10.10 Summary

"Ansible Tower provides a browsable REST API that can easily be used to automate Ansible Tower operations and integrate it with third-party products."

-> misplaced and repeats 11.6 Summary

11.4 Guided Exercise: Interacting with APIs using Ansible Playbooks

-> In the first playbook 'tower_copy_template.yml', registering the first 'uri' call to grab the inventory id is not only needless but also confusing. Indeed, the retrieved value 'copy.json.inventory' is already an attribute of the newly copied template, not one of the original one.

12.2 Guided Exercise: Importing External Static Inventories

(Step 4.7) "Click the double-arrow icon in the row for the git-inventory source to retrieve the changes. Wait until the cloud icon next to git-inventory is static and green."

-> this step is not needed because we checked the box : UPDATE ON PROJECT CHANGE

12.5 Filtering Hosts with Smart Inventories

"but that is not not the case"

12.6 Guided Exercise: Filtering Hosts with Smart Inventories

"These two systems' facts are available in Ansible Tower's cache because in a previous exercise we executed a job on those managed hosts with a job template that had fact caching enabled."

-> again, it would help a lot to specify which lab at least (it's actually guided exercise 10.2), or even better: leverage the lab start script to enforce that

14.4 Guided Exercise: Configuring TLS/SSL for Ansible Tower

[root@tower ~]# semanage fcontext -a -t cert_t "/etc/tower(/.*)?"

-> that pretty loose pattern matches all Tower configuration files, most of which are unrelated to certificates
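
As an illustration of a tighter pattern, here is a sketch using the sefcontext module. It assumes the certificate and key are stored as /etc/tower/tower.cert and /etc/tower/tower.key; those paths are an assumption and should be checked against the actual installation.

```yaml
- name: Label only the Tower certificate files
  hosts: tower                        # hypothetical inventory name for the Tower VM
  become: true
  tasks:
    - name: Map the certificate and key to cert_t
      sefcontext:
        target: '/etc/tower/tower\.(cert|key)'   # assumed file names
        setype: cert_t
        state: present
```

The new mapping only applies to existing files after a restorecon on them.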

-> Unfortunately, I have never been able to complete this guided exercise, probably because of a previous reset of my utility VM. Even after having run 'lab tower-install start' (from guided exercise 6.4) to re-install IdM, I still missed the certmonger package on my tower VM. After having installed that package manually, I still missed the proper Kerberos configuration. I gave up at that point to save up some of my scarce lab time. So sad that the lab start script does not take care of all that.

15. Comprehensive review

-> The labs' solutions are not hidden, so the comprehensive review cannot serve as a mock exam :(

15.3 Lab: Privilege Escalation, Lookups, and Rolling Updates

"If a single host fails to update, the playbook must stop executing immediately."

-> as said before, any_errors_fatal: true should be a valid answer here, but it is not, only max_fail_percentage: 0 is accepted as a valid solution
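
A sketch of the two variants side by side. The group name, batch size and task are hypothetical. In both plays, a failure on one host of the current batch aborts the play instead of letting the remaining hosts carry on.

```yaml
- name: Rolling update, variant accepted by the grading script
  hosts: webservers                   # hypothetical group
  serial: 2                           # hypothetical batch size
  max_fail_percentage: 0              # abort as soon as any host in the batch fails
  tasks:
    - name: Placeholder for the update tasks
      debug:
        msg: "updating {{ inventory_hostname }}"

- name: Rolling update, variant rejected by the grading script
  hosts: webservers                   # hypothetical group
  serial: 2
  any_errors_fatal: true              # any task failure on any host is fatal for the play
  tasks:
    - name: Placeholder for the update tasks
      debug:
        msg: "updating {{ inventory_hostname }}"
```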

(step 4) "Introduce logging tasks to register the start and end of the deployment on your control node."

-> Using lineinfile and delegate_to: localhost ends up with concurrent writing operations on the controller, leading to some lines missing, as explained here. Even though that buggy behaviour is more or less mitigated by the batch updates set up at the next step, it is not a very good practice.
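
One way to avoid the concurrent writes, sketched under assumptions: the log path and message are hypothetical, and run_once is a technique swapped in here, not the course's solution. With run_once, the task executes a single time per batch instead of once per host, so only one process writes to the file.

```yaml
- name: Log the start of the deployment once per batch
  lineinfile:
    path: /tmp/deployment.log         # hypothetical log file on the control node
    line: "Deployment started for {{ ansible_play_batch | join(', ') }}"
    create: true
  delegate_to: localhost
  run_once: true                      # a single writer per batch, no concurrent lineinfile calls
```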

15.9: Lab: Testing the Prepared Environment

-> This lab is broken because of leftover content in /var/lib/mysql/ on servere, which prevents mariadb from starting on that server. That leftover content comes from scenario 2 of lab "4.6 - Guided Exercise: Implementing Advanced Loops", where mysql-server (and not mariadb-server) was installed.

Workaround:

[root@servere ~]# rm -Rf /var/lib/mysql/*

+ relaunch the Full Stack Deployment workflow Job Template

I may have skipped a lab that cleans up that content at some point during the course.

Razique · ‎09-12-2019

@littlebigfab first, thank you so much for taking the time to provide such thorough feedback on the DO447 course. I am glad that you enjoyed the course. This is a great course.

I have contacted the team to let them know about all these issues and typos. They will come up with a resolution plan and determine their next course of action.

-razique

daniel-deptula · ‎02-07-2020

Hi Razique,

I have exactly the same feelings as "littlebigfab" when going through this course. The number of mistakes / typos is extraordinary and the fact that the scripts for lab setup don't work properly or have some undocumented dependencies is frustrating.

It's a shame, Red Hat, that a paid course is of such poor quality, as if no one had ever tested it.

BTW, I'm planning to take the EX447 and hope the exam isn't as buggy as the course.

My findings, not mentioned by "littlebigfab":
* Guided Exercise 4.6 "Implementing Advanced Loops" requires IPA to be installed and configured on the utility server but the lab setup script doesn't do it. I installed IPA manually, then a couple of chapters later noticed it's installed by the lab setup script for exercise 6.4.

* Guided Exercise 4.8 "Working with Network Addresses Using Filters", step 1.2 says that "In this case, there is no explicit PTR record for the managed host, so an in-addr.arpa name is returned." which isn't true. All the lab servers have proper reverse DNS records pointing to their host names, for example:
10.250.25.172.in-addr.arpa. 3600 IN PTR servera.lab.example.com.
This address can be obtained in a playbook using the "dig" plugin like this:
address_dns: "{{ lookup('dig', server_address + '/PTR') }}"
The "ipaddr('revdns')" filter (which is the solution expected by the author) doesn't resolve the PTR record, it only converts an IP address to a PTR record name which then can be queried and resolves to the reverse DNS name.
Either the task description has to be changed / re-phrased or the expected solution.

* In exercise 6.6 the lab setup script fails to create the demo project (/var/lib/awx/projects/_4__demo_project) in the Tower because directory /root/ansible/DO447 doesn't exist on the workstation host. I had to manually download it from http://materials.example.com/classroom/ansible/DO447. It seems that it would only be downloaded if the "lab tower-install solve" was executed which no one ever asks to run. I figured it out looking at the lab setup scripts.

* The last and the WORST: Guided Exercise 9.2 "Creating a Project for Ansible Playbooks". The lab setup script doesn't create the git repository required for the exercise (ssh://git.lab.example.com/var/opt/gitlab/git-data/repositories/git/my_webservers_DEV.git) and the repo content doesn't even seem to exist in http://materials.example.com/git_repos or http://materials.example.com/classroom/ansible which contain the content for the git repositories and other materials for most of the other exercises.
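
To illustrate the 4.8 finding above, a small sketch that prints both values. The address 172.25.250.10 is derived from the PTR record quoted in that item; the play assumes the netaddr and dnspython Python libraries are available on the control node.

```yaml
- name: Compare ipaddr('revdns') with an actual PTR lookup
  hosts: localhost
  gather_facts: false
  vars:
    server_address: 172.25.250.10     # address behind 10.250.25.172.in-addr.arpa.
  tasks:
    - name: Show the PTR record name (no DNS query involved)
      debug:
        msg: "{{ server_address | ipaddr('revdns') }}"

    - name: Show the host name the PTR record resolves to
      debug:
        msg: "{{ lookup('dig', server_address + '/PTR') }}"
```

The first task only rewrites the address into its in-addr.arpa form, while the second one actually queries DNS.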

I'm not happy about spending my time analyzing and fixing the buggy scripts written by you, especially since this course isn't marked as "Early Access".

Daniel

Razique · ‎02-07-2020

Hey Daniel,

Sorry to hear that you are having issues with the course. We do spend a lot of time testing and editing our courses to make sure we catch "everything". We also run two full end-to-end QA passes to make sure that we don't miss anything.

Unfortunately, when a course spans over 500 pages or so, mistakes such as typos are bound to happen, no matter how many sets of eyes we have on it. It is the same when you install software: even if it is a paid product, there are issues, otherwise there wouldn't be such a thing as bugfix releases.

For the issues that instructors or students catch down the road, we file a defect, and the team then qualifies and prioritizes those defects.

I reached out to the team who wrote that course to let them know about the issues that you have mentioned. They will look into them and see how/when they will be able to provide a fix.

Let me know if you have any questions,

-razique

another-newbie · ‎02-26-2020

I had tremendous issues with the DO447 labs.

They are designed so poorly. Later chapters' labs depend on previous ones.
There is no clear direction on where you should start over if a lab does not work.
The lab environment is constantly corrupted.
The lab basically needs provisioning every time you log in. You need to start over your lab from day one.

I am not sure how this training could pass Red Hat's own QA tests.

another-newbie · ‎02-26-2020

Each time I opened a Red Hat support ticket, it was no help.

Khamid · ‎06-17-2020

Looks like there are some bugs in the lab grading scripts, especially for review-cr3 grade.

bschonec · ‎09-10-2020

In my case, the dynamic inventory Python script doesn't work because the tower server doesn't have python-ldap module and the FreeIPA/LDAP server doesn't have LDAP installed/listening. It makes it impossible to finish the comprehensive review.

Celebrian · ‎11-16-2020

Hi, I had this same problem, and found the answer here: https://learn.redhat.com/t5/Automation-Management-Ansible/DO447-has-anyone-gotten-the-dynamic-invent...
Tl;dr: You need to run the "lab advinventory-dynamic start" command to initialize the ldap server so that the dynamic inventory script works.

ricardo_jun · ‎01-08-2021

@bonnevil , @PhilSweany

Could you check their requests, please?

Thanks